Abstract. The factor graph of an instance of a symmetric constraint
satisfaction problem on n Boolean variables and m constraints (CSPs 
such as k-SAT, k-AND, k-LIN) is a bipartite graph describing which 
variables appear in which constraints. The factor graph describes the 
instance up to the polarity of the variables, and hence there are up to
$2^{km}$ instances of the CSP that share the same factor graph. It is well
known that factor graphs with certain structural properties make the 
underlying CSP easier to either solve exactly (e.g., for tree structures) 
or approximately (e.g., for planar structures). We are interested in the 
following question: is there a factor graph for which if one can solve every 
instance of the CSP with this particular factor graph, then one can solve 
every instance of the CSP regardless of the factor graph (and similarly, 
for approximation)? We call such a factor graph universal. As one needs 
different factor graphs for different values of n and m, this gives rise to 
the notion of a family of universal factor graphs. 

We initiate a systematic study of universal factor graphs, and present 
some results for max-kSAT. Our work has connections with the notion
of preprocessing as previously studied for closest codeword and closest
lattice-vector problems, with proofs for the PCP theorem, and with tests
for the long code. Many questions remain open. 



1 Introduction 

A constraint satisfaction problem (CSP) has a set of n variables and a set of 
m constraints (also referred to as clauses, or factors). Every constraint involves 
a subset of the variables, and is satisfied by some assignments to the variables 
and not satisfied by others. An instance of a CSP is satisfiable if there is an 
assignment to the variables that satisfies all constraints. When variables are 
Boolean and constraints are symmetric a constraint is fully specified by the set 
of literals that it contains (where a literal is either a variable or its negation), 
and is satisfied if and only if the appropriate number of literals is set to true 
(e.g., at least one for SAT, an odd number for XOR, all for AND, the majority 
for MAJ, and at least one but not all for NAE). To simplify the presentation, 
we shall consider in this paper CSPs that are Boolean and symmetric, though
we remark that much of what we discuss can be extended to non-Boolean and
non-symmetric CSPs. 

The factor graph of an instance of a CSP is a bipartite graph. Vertices on one 
side represent the variables, vertices on the other side represent the constraints 
(also known as factors), and edges connect constraints to the variables that they 
contain. For Boolean symmetric CSPs, a factor graph together with a labeling 
of the edges with ±1 (indicating whether the corresponding variable has positive 
or negative polarity in the underlying clause) completely specifies an instance 
of the CSP. Without the edge labels, there are many instances of the CSP that 
share the same factor graph and differ only in the polarity of the variables. 
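
The correspondence between a factor graph, its edge labels and the resulting instances can be sketched in a few lines of Python. This is only an illustration: the representation (a list of variable tuples, signs in {+1, -1}) and the names `instances` and `satisfies_sat` are assumptions made here, not notation from the paper.

```python
from itertools import product

# Hypothetical representation: a factor graph is a list of constraints,
# each constraint being the tuple of variable indices it touches.
factor_graph = [(0, 1, 2), (1, 2, 3)]

def instances(factor_graph):
    """Yield every instance that shares the given factor graph.

    An instance is a list of clauses; a clause is a tuple of
    (variable, sign) pairs, where sign in {+1, -1} is the edge label.
    """
    num_edges = sum(len(clause) for clause in factor_graph)
    for labels in product((+1, -1), repeat=num_edges):
        it = iter(labels)
        yield [tuple((v, next(it)) for v in clause) for clause in factor_graph]

def satisfies_sat(clause, assignment):
    """SAT predicate: at least one literal of the clause is true."""
    return any((assignment[v] == 1) == (sign == +1) for v, sign in clause)

all_instances = list(instances(factor_graph))
# With k = 3 variables per constraint and m = 2 constraints there are
# 2 ** (k * m) labelings of the edges.
print(len(all_instances))
```

Replacing `satisfies_sat` by another symmetric predicate (odd number of true literals, all true, and so on) gives the other CSPs mentioned above on the same factor graph.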

As is well known, deciding satisfiability for CSPs is NP-hard for a large 
class of predicates (including SAT, MAJ and NAE). See [24] for a complete
classification. Here we shall consider NP-hard CSPs. The research question that
motivates our current paper is to understand the obstacles to obtaining
efficient algorithms for solving CSPs. Specifically, are algorithms having trouble
in "understanding" the structure of the factor graph, and this translates to 
difficulties in solving the underlying CSP? Alternatively, are the computational 
difficulties a result of the combinatorial richness of the polarities? 

The structure of the factor graph may cause the underlying CSP instance to 
be easy. For example, if the factor graph is a tree (or more generally, of bounded 
treewidth), then the underlying CSP instance can be solved in polynomial time 
(by dynamic programming). Our research question (once properly formalized) 
can be viewed as asking whether in other cases, the structure of the factor graph 
might be the major contributing factor to making a CSP hard. 

The playing field of our research agenda is greatly enriched once optimiza- 
tion versions of CSPs are considered, namely max-CSP: find an assignment to 
the variables that satisfies as many constraints as possible. As is well known, 
even some polynomial time solvable CSPs (such as XOR, or 2SAT) become NP- 
hard when their optimization version is considered. See [8] for a classification. A 
standard way of dealing with NP-hard max-CSP instances is via approximation 
algorithms that in polynomial time find an assignment that is guaranteed to 
satisfy a number of constraints that is at least p times the maximum number
of constraints that can be satisfied, for some 0 < p < 1. For many CSPs, the
best possible p is known, in the sense that the approximation ratios provided 
by known approximation algorithms are matched by hardness of approximation 
results that show that better approximation ratios would imply that P=NP. For 
example, p = 7/8 is a tight approximation threshold for max-3SAT [12]. Moreover,
for all CSPs, an algorithm (based on semidefinite programming) with the
optimal approximation ratio is given by Raghavendra [22], assuming the Unique 
Games Conjecture of Khot [15]. However, despite the optimality of this algo- 
rithm, it is difficult to figure out which approximation ratio it guarantees, and 
consequently there are CSPs for which the value of this threshold is not known. 
(And of course, if the Unique Games Conjecture is false then the approximation 
ratio implied by this algorithm need not be tight.)

Our research agenda naturally extends to max-CSP. One may ask whether
approximation algorithms are having trouble in "understanding" the structure of 
the factor graph, and whether this translates to difficulties in approximating the 
underlying CSP. Moreover, now the question acquires also a quantitative aspect, 
and one may ask to what extent does the factor graph contribute to the approxi- 
mation difficulty. For example, if algorithms had no difficulty in "understanding" 
factor graphs, could the approximation ratio for max-3SAT be improved from 
7/8 to 8/9? 

As in the case of tree factor graphs for decision versions, there are known 
families of factor graphs (such as planar graphs, or more generally, families of 
graphs excluding a fixed minor) on which the underlying CSP instance has improved
approximation ratios, or even a PTAS ($p > 1 - \epsilon$ for every $\epsilon > 0$). On the
other hand, it appears that for some CSPs, almost every factor graph is difficult. 
For example, there is no known approximation algorithm that runs in polynomial
time on random 3CNF formulas (with say $m = n \log n$ constraints) and
approximates max-3SAT within a ratio better than 7/8. This suggests (though 
does not prove) that there is no need for clever design of the factor graph in 
order to make the underlying CSP instance difficult - almost any factor graph 
would do. 

In contrast, for unique games (which is a special family of CSPs with two 
non-Boolean variables per constraint), the approximation ratios achievable on 
random factor graphs [4] are much better than those currently known to be 
achievable on arbitrary factor graphs. (Technically, the graphs considered by 
Arora et al. [4] have variables as vertices and constraints as edges, but there is 
a one-to-one correspondence between such graphs and factor graphs.) The same 
holds for some other classes of graphs [25, 17]. Can we (and should we) identify
more factor graphs on which unique games are easy? Is there a "universal" graph 
(e.g., a generalized Kneser constraint graph?) such that if unique games are easy
on it, then the Unique Games Conjecture is false? Such questions lead naturally 
to the notion that we call here universal factor graphs. 

1.1 Preprocessing 

How can we provide evidence that algorithms for max-3SAT should be spend- 
ing substantial time in analyzing the factor graph? Here is a possible formal 
approach. Reveal the input instance in two stages. In the first stage, only the 
factor graph is revealed. At this point the algorithm is allowed to run for ar- 
bitrary time and record (in polynomial space) whatever information about the 
factor graph that it may hope to find useful (e.g., an optimal tree decomposition 
of the factor graph, or a minimum dominating set in the factor graph, both of 
which are pieces of information that take exponential time to compute). Thereafter
the polarities of the variables are revealed. At this stage the algorithm has
only polynomial time, and it needs to find an optimal solution to the max-3SAT 
instance. If there is a combination of algorithms (unbounded time for stage 1, 
polynomial time for stage 2) that can do this on every instance, this establishes 
that a good understanding of the factor graph suffices for solving 3SAT instances.
If this cannot be done, this establishes that at least some substantial portion of
the running time is a result of the combinatorial richness of the space of possibilities
for polarities of the variables. Refined versions of the preprocessing approach either
require less of the stage 2 algorithm (finding nearly optimal solutions rather
than optimal ones) or give it extra power (allow subexponential time), and may
lead to a more quantitative understanding of the value of preprocessing. 
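
The two-stage model can be sketched as a pair of functions, one that sees only the factor graph and one that sees the polarities together with the recorded advice. The sketch below is a toy under stated assumptions: the function names are hypothetical, the advice (connected components of the factor graph) is cheap to compute and was chosen only to show the interface, and stage 2 runs in polynomial time only when every component has O(log n) variables.

```python
from itertools import product

def preprocess(num_vars, factor_graph):
    """Stage 1: sees only the factor graph and returns the advice.

    Toy advice: the connected components of the variables, via union-find.
    """
    parent = list(range(num_vars))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for clause in factor_graph:
        for v in clause[1:]:
            parent[find(v)] = find(clause[0])
    comps = {}
    for v in range(num_vars):
        comps.setdefault(find(v), []).append(v)
    return list(comps.values())

def solve_with_advice(factor_graph, polarities, advice):
    """Stage 2: sees the polarities and the advice.

    Returns the maximum number of simultaneously satisfied SAT clauses,
    solving each component independently by exhaustive search.
    """
    total = 0
    for comp in advice:
        comp_set = set(comp)
        idx = [c for c, clause in enumerate(factor_graph) if clause[0] in comp_set]
        best = 0
        for bits in product((0, 1), repeat=len(comp)):
            val = dict(zip(comp, bits))
            sat = sum(
                any((val[v] == 1) == (s == +1)
                    for v, s in zip(factor_graph[c], polarities[c]))
                for c in idx
            )
            best = max(best, sat)
        total += best
    return total
```

A universal factor graph is one for which no choice of `preprocess`, however expensive, is expected to make the second stage efficient.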

To derive positive results in this model, it suffices to provide the respective 
algorithms and their analysis. But how does one provide negative results? This 
is where the notion of universal factor graphs comes in. Informally, these are 
factor graphs on which preprocessing is unlikely to help, because if it does, 
then all instances (regardless of their factor graph) can be solved even without 
preprocessing. 

1.2 Universal Factor Graphs 

We consider infinite families of factor graphs. Basically, for every value of N,M > 
0, a family includes at most one factor graph with N variables and M constraints. 
However, for convenience in intended future uses, members of the family are 
indexed by two auxiliary indices that are called n and m. Definition 1 does not
exclude the possibility that several factor graphs in the family share the same 
values of N and M, but their number is upper bounded by some polynomial in 
N + M. 

Definition 1. Consider an arbitrary CSP with k variables per constraint. For
integers $n > 0$ and $0 < m \le 2^k \binom{n}{k}$, let N(n, m) and M(n, m) be two functions,
each lower bounded by n and upper bounded by a polynomial in n + m. A family
of factor graphs associates with each pair of values of n and m a factor graph
with N(n, m) variables and M(n, m) constraints. The family is uniform if there
is an algorithm running in time polynomial in n + m that given n, m produces 
the associated factor graph. 

Every member of a family of factor graphs for a k-CSP can give rise to $2^{kM}$
instances of the CSP, depending on how one sets the polarities of the variables 
in the constraints. Given any such instance as input, we shall consider compu- 
tational tasks such as satisfiability (find a satisfying assignment if one exists), 
optimization (find an assignment satisfying as many clauses as possible) and 
approximation (get close to optimal). 

The algorithms that perform the above tasks will be limited in their running 
times. In this work, we shall be interested in two classes of running times. One is 
the standard polynomial time (P) notion, which in our case will mean polynomial 
in (N + M). The other is subexponential time (SUBEXP), which in this paper
is taken to mean time $2^{O(N^{1-\epsilon})}$ for some $\epsilon > 0$.

Recall that in computational complexity theory, one distinguishes between 
uniform models of computation (such as Turing machines) and non-uniform mod- 
els (such as families of circuits). This distinction is relevant in our context. The 
notion of preprocessing the factor graph can be captured by allowing for nonuniform
algorithms. Hence we shall be dealing with the complexity classes P/poly,
SUBEXP/poly and SUBEXP/subexp (the parameters /poly and /subexp correspond
to the length of advice that the preprocessing stage is allowed to record).
For simplicity in our presentation, in each of our definitions below we shall specify 
one particular complexity class (either P/poly or SUBEXP/poly), but we note 
that our results extend to other complexity classes as well (such as P instead of 
P/poly, or SUBEXP/subexp instead of SUBEXP/poly). 

In this work we will show that for some uniform families of factor graphs,
solving satisfiability or approximation tasks is hard. These families of factor
graphs will be referred to as universal, and with slight abuse of terminology, 
individual factor graphs within these families will be referred to as universal 
factor graphs. The hardness results will be proved under some complexity as- 
sumption. If the complexity assumption is widely believed, such as that NP is 
not contained in P/poly, then the universal factor graphs support the view that 
the complexity of the underlying CSP cannot be attributed entirely to the factor 
graph and is at least partly due to the polarities of the variables, because the 
nonuniform algorithms could preprocess the factor graph for arbitrary time prior 
to receiving the polarities of the variables. If the complexity assumption is not 
as widely believed (such as the Unique Games Conjecture), the interpretation of
these hardness results can be that if one wishes to refute the complexity assumption,
it would suffice to design algorithms that are specifically tailored to work
on instances with factor graphs as in the universal family. 

We now present formal definitions that are tailored to match those results 
that we can prove in this paper. It is straightforward to adapt these definitions 
to other variations as well. 

Definition 2. For a given CSP, a uniform family of factor graphs is P-universal 
if there is no P/poly algorithm for instances of the CSP with factor graphs from 
this family, unless NP is contained in P/poly. 

Definition 3. For a given CSP, a uniform family of factor graphs is subexp- 
universal if there is no SUBEXP/poly algorithm for instances of the CSP with 
factor graphs from this family, unless there is a SUBEXP/poly algorithm for all 
instances of the CSP. 

Definition 4. For a given CSP and 0 < p < 1, a uniform family of factor
graphs is p-universal if there is no P/poly approximation algorithm with approximation
ratio better than p on the instances of the CSP with factor graphs
from this family, unless NP is contained in P/poly. This notion is referred to as
threshold-universal. If p is equal to the best approximation ratio known for the
underlying CSP, we will refer to this as a tight threshold. When we do not wish
to specify a particular value for p, we call the family APX-universal. A variation
on p-universality is (c, s)-universality with $0 < s < c \le 1$, where instead of approximation
within a ratio of p, one considers distinguishing between instances
with at least a c-fraction of the clauses being satisfiable, and instances with at
most an s-fraction being satisfiable. For a CSP for which the decision variant is
NP-hard (e.g., 3SAT), p-universality will be taken to mean (1, p)-universality.

More generally, for optimization versions we shall allow vertices (representing
constraints) of universal factor graphs to have nonnegative weights, thus repre- 
senting instances in which one wishes to find an assignment that maximizes the 
weight (rather than the number) of satisfied constraints. As the weights will 
be fixed (independently of the subsequent polarities given to variables), this is
in essence a condensed representation of an unweighted universal factor graph 
(which can be obtained by duplicating each vertex a number of times propor- 
tional to its weight, rounded to the nearest integer - details omitted). 
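
The duplication step can be sketched as follows. The function name `unweight` and the `scale` parameter are hypothetical, and the sketch assumes that every copy of a constraint vertex later receives the same polarities as the original; the paper omits these details. Python's built-in `round` resolves exact ties to the nearest even integer, which is one possible reading of "nearest integer".

```python
def unweight(factor_graph, weights, scale):
    """Replace each weighted constraint vertex by round(scale * weight) copies.

    factor_graph: list of variable tuples (one per constraint vertex).
    weights:      nonnegative weight of each constraint vertex.
    scale:        common multiplier controlling the rounding error.
    """
    unweighted = []
    for clause, w in zip(factor_graph, weights):
        unweighted.extend([clause] * round(scale * w))
    return unweighted

# Example: weights 0.5 and 1.25 with scale 4 give 2 and 5 copies.
copies = unweight([(0, 1, 2), (1, 2, 3)], [0.5, 1.25], scale=4)
```

A larger `scale` makes the relative rounding error smaller at the cost of a larger (but still polynomial, for polynomially bounded `scale`) unweighted factor graph.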

1.3 Some Research Goals 

The notion of universal factor graphs opens up many research directions that 
we find interesting. In our current work we attempt to answer questions such 
as: Does 3SAT have P-universal factor graphs? Subexp-universal factor graphs? 
Does max-3SAT have APX-universal factor graphs? Does max-3SAT have 7/8- 
universal factor graphs? These questions are part of a wider research agenda 
that concerns questions such as: Do all CSPs have tight threshold-universal 
factor graphs? Which CSPs do not have tight threshold-universal factor graphs? 
Other questions of interest include: What do universal factor graphs look like?
Can knowledge of their structure help us either in designing new algorithms, or 
in reductions that prove new hardness results? 

1.4 Related Work 

There has been work showing that CSPs on particular factor graphs are NP-hard, 
and using such results to help in reductions establishing further NP-hardness 
results. For example, it is known that 3SAT is NP-hard even when the factor 
graph is planar [18], and this was used (for example) in showing that minimum-
length rectangular partitioning of a rectilinear polygon (with holes) is NP-hard
[19]. Our notion of universal factor graphs is stronger as it requires at most
one particular factor graph for each instance size, rather than a whole family of 
factor graphs (e.g., the n by n grid, rather than all planar graphs). 

A line of work that closely relates to our research agenda is that of prepro- 
cessing for NP-hard problems. As the universal factor graph is fixed, one may 
consider preprocessing it for arbitrary (exponential) time in order to produce a 
polynomial size "advice", prior to getting the polarities of the variables. Preprocessing
was extensively studied for some NP-hard problems, and hardness
results in the context of preprocessing amount to designing instances that are
universal (in our terminology). Naor and Bruck [7] show that the nearest code
word problem remains NP-hard even when the code can be preprocessed. Nearest
lattice vector (CVP) when the lattice can be preprocessed was shown to
be NP-hard and APX-hard by Feige and Micciancio [10]. The tightest hardness
results for lattice problems with preprocessing currently known are by Khot et
al. [16]. An earlier work by Alekhnovich et al. [2] has some partial overlap with
our current work, because it uses PCP theory and in the process gives hardness
of approximation results with preprocessing for additional problems. See more
details in Section 2.2.

The above results on coding and lattice problems with preprocessing are 
motivated by the fact that in these problems, it is indeed often the case that 
part of the input is fixed in advance (the code, or a basis for the lattice), and 
part of the input (a noisy word that one wishes to decode, or vector for which 
one wishes to find the closest lattice point) is a query that is received only later. 
Moreover, multiple queries are expected to be received on the same fixed input. 
In these cases it really makes sense to invest much time in preprocessing the 
fixed part of the input, if this later helps answering the multiple queries more 
quickly. In contrast, our notion of universal factor graphs is independent of such 
practical concerns. Our motivation is to understand the source of difficulties 
in solving NP-hard problems. In particular, it is irrelevant to us whether there 
really is any real life situation in which one receives the factor graph of a 3CNF 
formula in advance, and then is asked a sequence of queries about it, each time 
with different polarities of the variables. 

Is it at all plausible that preprocessing can help? For lattice problems, this in- 
deed appears to be the case. There are no known approximation algorithms with 
subexponential ratios for CVP, but if preprocessing is allowed, then polynomial
approximation ratios are known (by using an exponential time preprocessing 
procedure that derives a so called reduced basis of the lattice). For CSPs, the au- 
thors are aware of only much weaker evidence that preprocessing may help. This 
relates to the case that polarities of variables are random rather than arbitrary. 

There is a refutation algorithm that is poly-time on random 3CNF formulas
with more than $n^{1.5}$ clauses. The obstacle to extending this to lower density of
$n^{1.4}$ is graph-theoretic: if one knew how to efficiently find certain substructures in
the factor graphs (that almost surely exist), this would suffice [11]. Preprocessing 
the factor graph would allow finding these structures. Hence at these densities, 
random factor graphs are not expected to be universal (with respect to random 
polarities). 

In the current paper we consider arbitrary polarities for the variables rather 
than random polarities. Nevertheless, we remark that the case of random polarities
is also well motivated, and related to possible cryptographic applications.
See [3] for an example showing how such refutation results can be used in a proposal
of new public key cryptographic primitives.

More generally, cryptography offers many examples where preprocessing is 
believed to help (it will lead to the discovery of a so called trapdoor that would 
make solving future instances easy), but as this typically relates to computa- 
tional problems that are believed not to be NP-hard, further discussion of this 
is omitted from the current manuscript. 

1.5 Our Results 

The first theorem is based on a straightforward reduction and we have no doubt 
that it was previously known (perhaps using different terminology).

Theorem 1. There are P-universal factor graphs for 3SAT.

For the P-universal factor graphs constructed by our proof for Theorem 1,
an algorithm running in time 2 N on instances of the universal family would 
correspond to time 2 n on general instances. Hence they are not subexp- 
universal. The next theorem addresses this issue. 

Theorem 2. There are subexp-universal factor graphs for 3SAT.

We would have liked to prove that there are 7/8-universal factor graphs 
for max-3SAT, matching the tight threshold of approximability for max-3SAT. 
However, we only managed to prove weaker bounds. 

Theorem 3. There are 77/80-universal factor graphs for max-3SAT.

Is there any CSP for which we can obtain tight threshold-universal families? 
We do not know, but we do have almost tight results. 

Theorem 4. For every $\epsilon > 0$ there is an integer k for which there is a family
of factor graphs that are $(1 - (1 - \epsilon)2^{-k})$-universal for max-EkSAT.

Theorem 4 is nearly tight because every instance of max-EkSAT is
$(1 - 2^{-k})$-satisfiable, and consequently there are several algorithms with a $(1 - 2^{-k})$
approximation ratio. To actually get tight results we would need to switch the
order of quantifiers in Theorem 4 (show that for some k the result holds for every
$\epsilon$), but doing so remains an open question.
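
The $(1 - 2^{-k})$ bound follows from an averaging argument: a clause on k distinct variables is falsified by exactly a $2^{-k}$ fraction of all assignments, so the average assignment satisfies a $(1 - 2^{-k})$ fraction of the clauses and the best assignment does at least as well. The small check below illustrates this for k = 3 by exhaustive enumeration; the instance and the helper names are made up for the example.

```python
from fractions import Fraction
from itertools import product

def satisfied(clause, assignment):
    """SAT predicate on a clause given as (variable, sign) pairs."""
    return any((assignment[v] == 1) == (sign == +1) for v, sign in clause)

def average_and_best(num_vars, clauses):
    """Return (average, best) fraction of satisfied clauses over all assignments."""
    counts = [sum(satisfied(c, a) for c in clauses)
              for a in product((0, 1), repeat=num_vars)]
    m = len(clauses)
    return Fraction(sum(counts), len(counts) * m), Fraction(max(counts), m)

# A hypothetical E3-CNF instance on 4 variables.
clauses = [
    ((0, +1), (1, +1), (2, +1)),
    ((0, -1), (1, +1), (3, -1)),
    ((1, -1), (2, -1), (3, +1)),
    ((0, +1), (2, -1), (3, -1)),
]
avg, best = average_and_best(4, clauses)
# avg equals 1 - 2**-3 exactly, whatever the polarities are.
```

The average does not depend on the polarities at all, only on the fact that each clause has k distinct variables; the polarities matter only for how far `best` exceeds the average.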

Using the techniques developed in our work and known reductions among 
CSPs one can obtain APX-universal factor graphs for additional CSPs. In par- 
ticular, we derive APX-universal factor graphs for max-2LIN, thus illustrating 
that for approximating unique games (max-2LIN is a unique game) at least 
part of the difficulty comes from the polarities of variables rather than from the 
structure of the factor graph. See Appendix F.

2 Overview of proofs 

At a high level, to show that a factor graph is universal, one shows that any 
other factor graph (of the appropriate size) can be reduced to it. The details of 
how this is done depend on the context. 

The proof of Theorem 1 appears in Appendix A. It is elementary and can
serve as an introduction to some of the more complicated proofs that follow. 

2.1 Subexp-Universal Families 

Our proof of Theorem 2 combines two ingredients. One is a variation on a result
of Impagliazzo et al. [14] (see Lemma 2 in Appendix B). It can be leveraged
to show that for the purpose of constructing subexp-universal factor graphs it 
suffices to consider 3CNF instances with a linear number of clauses. 

The other ingredient is a reduction with a tighter connection between n + m 
and N compared to the one used in our proof of Theorem 1.

Lemma 1. There is a factor graph with N = O(m log m log n) variables that is
P-universal with respect to 3SAT instances with n variables and m clauses. 

Our proof of Lemma 1 makes use of oblivious sorting networks (specifically, the
one of Ajtai et al. [1]).

More details on those two ingredients and how they are combined to prove 
Theorem 2 appear in Appendix B.

2.2 Threshold-Universal Families 

For our proof of Theorem 3 we use a notion that we call a factor graph preserving
reduction (FGPR). It is an algorithm that transforms a source 3CNF instance $f_s$
to a target 3CNF instance $f_t$. The transformation has the following properties:

1. Polynomiality. The transformation algorithm runs in polynomial time (in
the size of $f_s$). Consequently, the size of $f_t$ is polynomial in the size of $f_s$.

2. Faithfulness. If $f_s$ is satisfiable, so is $f_t$, and vice versa.

3. Factor graph preserving. Any two instances $f_s$ and $f'_s$ with the same factor
graph are reduced to two instances $f_t$ and $f'_t$ that have the same factor graph.

To be useful for our purposes, we would like the FGPR to also have a gap
amplification aspect. Namely, if $f_s$ is not satisfiable, then the fraction of clauses
satisfiable in $f_t$ is smaller than the fraction of clauses satisfiable in $f_s$.

Theorem 3 will be broken into two sub-theorems, each of which is proved
using FGPRs. 

Theorem 5. There are APX-universal factor graphs for max-3SAT. 

Theorem 6. There is a reduction from APX-universal factor graphs for max- 
3SAT to 77/80-universal ones.

The proof of Theorem 5 strongly relates to the work of Alekhnovich et al. [2].
As explained in Section 1.4, in that work various APX-hardness results with
preprocessing were obtained. Among them, there were APX-hardness results
with preprocessing for certain CSPs (satisfying quadratic equations). It is not
difficult to use these results in order to obtain APX-universal factor graphs
for max-3SAT. However, we present an alternative proof because [2] claims the
relevant theorem without providing a proof (see footnote 1). Our proof is patterned after a proof
of the PCP theorem due to Dinur [5]. 

Recall that Dinur's proof is based on a sequence of gap amplification steps.
However, some of these transformations are not factor graph preserving. Our
proof performs a sequence of gap amplifying FGPRs, starting with the outcome
of Theorem 1 and eventually proving Theorem 5. Every FGPR is based on
modifying Dinur's proof (or more exactly, on modifying a variation on Dinur's
proof that is given in [21]). The modifications are related to those discussed
below for the long code (though our proof for Theorem 5 uses a quadratic code
rather than the long code).

Footnote 1. Quoting from [2]: "The proof of this theorem, which is a laborious and an almost
exact mimic of the proof of the PCP Theorem, is beyond the scope of this version
of the paper." A subsequent paper [16] that extends [2] no longer uses this theorem,
and hence does not contain the proof either.

The proof of Theorem 6 involves an FGPR from APX-universal factor graphs
for max-3SAT to 77/80-universal ones. Our proof is based on a modification of
the proof of Bellare et al. [6], and consequently obtains the same hardness ratio
of 77/80. The main difficulty we encounter is the following. Tight or nearly tight 
hardness of approximation results use the so called long code. A major reason 
why it is used is that its high redundancy allows one to replace explicit queries 
that check whether an underlying predicate is satisfied by an implicit operation 
(referred to as folding) that allows one to avoid making these queries. The only 
queries that need to be made are those that check whether the encoding is really 
(close to) a long code. The saving in queries translates to stronger hardness of 
approximation results. The problem with folding is that it is sensitive to the 
predicate that needs to be checked, and a change in the predicate (e.g., changing 
the polarity of a single variable in a 3SAT clause) changes the folding. As a 
result, query locations change, and the resulting reduction is not an FGPR. To 
overcome this problem we introduce a notion of oblivious folding of the long code, 
which does allow us to eventually obtain an FGPR. We remark that it was not 
a priori obvious that a construct such as oblivious folding should exist at all. In
particular, tight hardness of approximation results for 3SAT by Håstad [12] use
a notion related to folding but somewhat stronger, that is called conditioning
of the long code. We were unable to find an "oblivious" version of conditioning
that can replace the conditioning used by Håstad, and consequently we do not
know if 7/8-universal factor graphs for 3SAT exist.

For the full proofs of Theorems 5 and 6 see Appendices C and D.

2.3 Threshold-Universal Families with Nearly Tight Bounds 

Recall that the prefix E (for exact) in EkSAT indicates that every clause in the
CNF formula contains exactly k literals (rather than at most) and no two literals
in a clause correspond to the same variable. It is not difficult to see that the proof
of Theorem 3 in fact gives E3CNF formulas, and not just 3CNF formulas (and
even if not, there are simple FGPRs from max-3SAT to max-E3SAT, with only
a bounded loss in the approximation ratio). Our proof of Theorem 4 is based
on a direct reduction from instances of max-E3SAT to instances of max-EkSAT.
This reduction has the property that mere APX-hardness of max-E3SAT suffices
in order to get nearly tight hardness of approximation ratios for the resulting
max-EkSAT instances, if k is sufficiently large.

Proof. Theorem[3]implies that there is a (1 — 7)-universal family of factor graphs 
for E3-CNF formulas, for some < 7 < |. We shall use this in an FGPR to prove 
Theorem [4] For simplicity of the presentation we shall describe our reduction as 
a reduction from a single E3-CNF formula cj>3 to a single Efc-CNF formula <fik- 
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As the factor graph resulting for <f>k will be independent of polarities of variables 
in 3 , this will be an FGPR. 

Let $\phi_3$ be an E3-CNF formula with n variables and m clauses for which one 
wants to distinguish between the case that it is satisfiable and the case that it is 
at most $(1 - \gamma)$-satisfiable. Formula $\phi_k$ will be obtained from a combination of $2^q$ 
auxiliary Ek-CNF formulas called $\psi_i$, for $0 \le i \le 2^q - 1$. Let $q = k - 3$. Introduce q 
fresh variables $y_1, \ldots, y_q$, and 3 fresh variables $z_1, z_2, z_3$. Formula $\psi_0$ is obtained 
from $\phi_3$ by adding the y variables (all in negative polarity) to each clause of 
$\phi_3$. As to the other formulas indexed by $i \ge 1$, each such formula $\psi_i$ has eight 
clauses, where each clause contains the variables $y_1, \ldots, y_q, z_1, z_2, z_3$. Excluding 
the all-negative polarity combination, there are $2^q - 1$ remaining combinations 
of polarities for the q variables of type y. Each such combination of polarities 
will be associated with the clauses of one $\psi_i$ for $i \ge 1$. One may think of the 
binary representation of i as specifying the polarity of the y variables in clauses 
of $\psi_i$, where if the j'th bit of i is 0 then $y_j$ is negative, and if the j'th bit of i 
is 1 then $y_j$ is positive. As to the z variables, there are 8 possible combinations 
of polarities. Within a formula $\psi_i$ there are 8 clauses, and each of them has a 
different combination of polarities for the z variables. 

The formula $\phi_k$ will be a weighted mixture of the $\psi_i$ (see Appendix [E] regarding 
an unweighted version). Formula $\psi_0$ is taken with weight $\frac{1}{8\gamma}$ (which is larger 
than 1 because $\gamma < 1/8$), spreading this weight equally among its m clauses. Each 
of the other $\psi_i$ is taken with weight 1, spreading the weight equally among its 8 
clauses. The total weight of $\phi_k$ is $2^q - 1 + \frac{1}{8\gamma}$. 

If $\phi_3$ is satisfiable, so is $\phi_k$: an assignment to the original variables of $\phi_3$ that 
satisfies $\phi_3$ also satisfies $\psi_0$, and assigning true to all y variables satisfies all $\psi_i$ 
for $i \ge 1$. If $\phi_3$ is only $(1 - \gamma)$-satisfiable then the weight of unsatisfied clauses in 
$\phi_k$ is at least $1/8$: if all variables y are assigned true, this results from $\psi_0$, and in 
all other cases, this results from one of the other $\psi_i$. 
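
The construction and the weight argument above can be checked by brute force on a toy instance. The following is only a sketch under stated assumptions: the clause encoding as (variable, is_positive) pairs, the function names `build_phi_k` and `min_unsat_weight`, and the choice k = 5, $\gamma = 1/10$ are ours for illustration and do not come from the paper.

```python
from fractions import Fraction
from itertools import product


def build_phi_k(phi3, n, k, gamma):
    """Weighted clauses of phi_k as (weight, literals) pairs.

    phi3: list of clauses, each a list of (variable, is_positive) pairs over
    variables 0..n-1. Variables n..n+q-1 play the role of y_1..y_q and the
    last three variables play the role of z_1, z_2, z_3.
    """
    q = k - 3
    ys = list(range(n, n + q))
    zs = list(range(n + q, n + q + 3))
    clauses = []
    # psi_0: every clause of phi3 gets all y variables in negative polarity.
    # Its total weight 1/(8*gamma) is spread equally over its clauses.
    w0 = Fraction(1) / (8 * gamma) / len(phi3)
    for clause in phi3:
        clauses.append((w0, list(clause) + [(y, False) for y in ys]))
    # psi_i for i >= 1: bit j of i fixes the polarity of y_j, and the eight
    # clauses range over all polarity patterns of the z variables.
    for i in range(1, 2 ** q):
        y_lits = [(ys[j], bool((i >> j) & 1)) for j in range(q)]
        for z_pol in product([False, True], repeat=3):
            clauses.append((Fraction(1, 8), y_lits + list(zip(zs, z_pol))))
    return clauses, n + q + 3


def min_unsat_weight(clauses, num_vars):
    """Smallest total weight of unsatisfied clauses over all assignments."""
    return min(
        sum(w for w, lits in clauses if not any(a[v] == pos for v, pos in lits))
        for a in product([False, True], repeat=num_vars)
    )


# Toy phi3: all 8 polarity patterns on 3 variables. Every assignment falsifies
# exactly one clause, so it is at most (1 - 1/10)-satisfiable.
all8 = [[(0, a), (1, b), (2, c)] for a, b, c in product([False, True], repeat=3)]
clauses, num_vars = build_phi_k(all8, n=3, k=5, gamma=Fraction(1, 10))
print(min_unsat_weight(clauses, num_vars))
```

On this toy input the minimum unsatisfied weight should come out as at least 1/8, matching the soundness claim; a satisfiable $\phi_3$ should give 0.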

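The total weight of $\phi_k$ is $W = 2^q - 1 + \frac{1}{8\gamma}$, and for q satisfying $2^q > \frac{1-\epsilon}{\epsilon}\left(\frac{1}{8\gamma} - 1\right)$ 
we have that $W < \frac{2^q}{1-\epsilon}$, which implies that $\frac{1}{8W} > \frac{1-\epsilon}{2^k}$. Hence $\phi_k$ is at most 
$\left(1 - \frac{1-\epsilon}{2^k}\right)$-satisfiable, as desired. □ 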

A Polytime-Universal Families 

Proof of Theorem [1] 

Proof. We design a universal factor graph for 3CNF formulas that have n variables 
and any number of clauses. For simplicity of presentation, we use the 
following convention. A clause is a tuple of three variables that need not be 
distinct, and the polarities of the variables. Two clauses may not have the same 
tuple, though two clauses may have the same set of variables if the order in 
which they appear in the respective tuples is different. Hence there are exactly 
$n^3$ possible tuples, and the number of clauses satisfies $m \le n^3$. 

The universal 3CNF formula is constructed as follows. Write down all $n^3$ 
possible tuples. For every tuple $T_i$ introduce an auxiliary variable $z_i$. For each 
tuple $T_i$ introduce two clauses as in the following example. If $T_i = x_1, x_2, x_3$ then 
the two clauses are $(x_1 \vee x_2 \vee z_i)$ and $(x_3 \vee z_i \vee z_i)$. This gives a formula F with 
$N = n + n^3$ variables and $M = 2n^3$ clauses. 

Every 3CNF instance I with n variables can be embedded in F, by appropriately 
setting only the polarities of variables. For a given tuple $T_i$, if no clause 
with this tuple is in I, then in F give all occurrences of $z_i$ positive polarity. By 
setting $z_i$ to true, the clauses of this tuple are satisfied, so the tuple effectively drops from F. But if a clause 
with tuple $T_i$ appears in I (there can be at most one such clause), then do as in 
the following example. If the clause is $(x_1 \vee \bar{x}_2 \vee x_3)$ then in F set the polarities 
of the clauses derived from $T_i$ to $(x_1 \vee \bar{x}_2 \vee \bar{z}_i)$ and $(x_3 \vee z_i \vee z_i)$. Any assignment 
that satisfies these two clauses in F satisfies also the original clause in I. Conversely, 
any assignment that satisfies the original clause extends to one satisfying both clauses: 
set $z_i$ to false if $x_3$ is true, and to true otherwise. 

The above implies that the factor graph of F is polytime-universal (for 3SAT 
with n variables). Any algorithm that decides satisfiability for formulas whose 
factor graph is that of F can be used to decide satisfiability of any 3CNF formula 
with n variables, by following the above embedding. □ 
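
The embedding can be sketched in a few lines and checked exhaustively for very small n. This is an illustrative sketch, not the authors' code: the dictionary representation of an instance, the variable numbering, and the helper names `embed` and `satisfiable` are assumptions made here.

```python
from itertools import product


def embed(instance, n):
    """Clauses of F with polarities chosen to embed `instance`.

    instance: dict mapping a tuple (a, b, c) of variables in range(n) to a
    triple of polarities (True = positive, False = negated). Variable n + i
    plays the role of the auxiliary variable z_i of the i'th tuple.
    """
    clauses = []
    for idx, (a, b, c) in enumerate(product(range(n), repeat=3)):
        z = n + idx
        if (a, b, c) in instance:
            pa, pb, pc = instance[(a, b, c)]
            clauses.append([(a, pa), (b, pb), (z, False)])
            clauses.append([(c, pc), (z, True), (z, True)])
        else:
            # Unused tuple: z_i is positive everywhere, so z_i = true
            # satisfies both clauses regardless of the x polarities.
            clauses.append([(a, True), (b, True), (z, True)])
            clauses.append([(c, True), (z, True), (z, True)])
    return clauses


def satisfiable(clauses, num_vars):
    """Brute-force satisfiability test (only feasible for tiny inputs)."""
    return any(
        all(any(assign[v] == pos for v, pos in cl) for cl in clauses)
        for assign in product([False, True], repeat=num_vars)
    )
```

For example, with n = 2 the formula F has N = 10 variables, so `satisfiable(embed(instance, 2), 10)` can be compared directly against the satisfiability of `instance`. The variable occurrences of `embed(...)` do not depend on `instance`, which is the factor graph property used in the proof.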

B Subexponential-Universal Factor Graphs 

In this section we prove Theorem [2]. Recall that this involves proving Lemma [2] 
and Lemma [1] and then combining them appropriately. 

Lemma 2. Given a 3CNF formula $\varphi$ with n variables and any number of clauses 
(at most $O(n^3)$, as clauses may be assumed to be distinct), for every $0 < \epsilon < 1/10$ 
there is an algorithm that runs in time $2^{O(n^{1-\epsilon})}$ and produces at most $2^{n^{1-\epsilon}}$ new 
3CNF formulas, each with n variables and at most $O(n^{1+2\epsilon} (\log n)^2)$ clauses, 
such that $\varphi$ is satisfiable iff at least one of the new formulas is satisfiable. 

Lemma [2] follows by substituting $\epsilon(n) = n^{-\epsilon}$ in the following lemma. 

Lemma 3. Given a k-CNF formula $\varphi$ with n variables, $0 < \epsilon(n) < 1$, and 
a (n) with i Q g!^\n) ^ 4/c2 fe_1 e _1 (n), there is an algorithm that produces at most 
2t(n)n J--CNF formulas, each with at most nk (4a (n)) 2 clauses andn variables 

in time 2 e ^ n n ( - 4a( - n ^ poly (n) such that ip is satisfiable iff at least one of the 
outputted formulas is satisfiable. 

Proof. A similar statement was proved by Impagliazzo et al. [14] with constant 
$\epsilon$ and $\alpha$, and the same proof works when they are not constant. □ 

We now prove Lemma [1]. 

Proof. We first construct a nondeterministic circuit that receives as input an 
E3CNF formula with n variables and m clauses and outputs 1 if the formula 
is satisfiable. We wish to keep the circuit small, of size $O(m \log m \log n)$. For 
this reason, the nondeterministic aspect of the circuit will not be a guess of 
the assignment to the variables (which amounts to n nondeterministic guesses), 
but rather a selection of one index per clause (hence m nondeterministic guesses, 
each among three possibilities), indicating a literal that satisfies this clause. 
The consistency of all these selections (namely, not selecting a variable in one 
clause and its negation in a different clause) will be checked using a circuit that 
mimics an oblivious sorting network. All selected literals will be sorted, implying 
that if there is a variable that was selected both positively and negatively, these 
two contradicting selections will "meet" during the sorting process and the 
inconsistency will be detected. As there are oblivious sorting networks that sort 
m numbers using $O(m \log m)$ comparisons [1], the size of the circuit will remain 
bounded by $O(m \log m \log n)$ (the extra $\log n$ term comes from the fact that it 
takes $\log n$ bits to specify each of the sorted numbers). Such a nondeterministic 
circuit outputs 1 iff the formula is satisfiable: a consistent selection of literals 
can always be completed to a satisfying assignment (by giving arbitrary values 
to variables for which no occurrence of their literals was selected) , whereas given 
a satisfying assignment a consistent selection is obtained by selecting the first 
satisfied literal in every clause. 

We now provide more details on the construction of the circuit. The circuit 
takes $O(3m \log n)$ input bits. $O(\log n)$ bits are used to represent each literal, 
with the least significant bit used to indicate if the literal is negated or not. In 
addition, there are 2 nondeterministic input bits per clause, used to select one 
of the three literals in the clause. The selection of literals can be done by using 
$O(\log n)$ 3-to-1 multiplexers. The selected literals are sorted using a sorting network 
of size $O(m \log m)$ (see [1, 20]), where each comparison is done by adding 
the representation of one literal to the two's complement of the representation 
of the other literal, using $O(\log n)$ adders, and the most significant bit of the 
result determines the output of the comparison. The literals are switched or 
not, depending on the result of the comparison, using $O(\log n)$ 2-to-1 multiplexers. 
Lastly, when all literals are sorted, each consecutive pair (with overlapping 
pairs) is checked that it does not contain the representation of a variable and its 
negation (all but last bit equal, using $O(\log n)$ gates). 
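
The logic of this circuit (select one literal per clause, sort obliviously, compare neighbours) can be simulated in software. The sketch below is an illustration under assumptions of ours: it swaps in an odd-even transposition network, which is oblivious but uses $O(m^2)$ comparators rather than the $O(m \log m)$ network the proof relies on, and it enumerates the nondeterministic selector bits by brute force. The function names are hypothetical.

```python
from itertools import product


def encode(var, negated):
    """Literal encoding with the negation flag in the least significant bit."""
    return 2 * var + (1 if negated else 0)


def oblivious_sort(values):
    """Odd-even transposition network: the compared positions are fixed in
    advance and do not depend on the data (a simple stand-in network)."""
    a = list(values)
    m = len(a)
    for rnd in range(m):
        for i in range(rnd % 2, m - 1, 2):
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
    return a


def selection_is_consistent(formula, selectors):
    """formula: list of clauses, each a list of three (var, negated) pairs.
    selectors: one index in {0, 1, 2} per clause."""
    chosen = [encode(*clause[s]) for clause, s in zip(formula, selectors)]
    s = oblivious_sort(chosen)
    # After sorting, x and its negation (equal except for the last bit) end up
    # in adjacent positions if both were selected.
    return all(not (x >> 1 == y >> 1 and x != y) for x, y in zip(s, s[1:]))


def circuit_accepts(formula):
    """Try every choice of selector bits (the nondeterministic guesses)."""
    return any(
        selection_is_consistent(formula, sel)
        for sel in product(range(3), repeat=len(formula))
    )
```

Because a variable x and its negation are encoded as consecutive integers 2x and 2x + 1, no other literal can sit between them in sorted order, which is why checking adjacent pairs suffices.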

Given an E3CNF formula $\varphi$ with m clauses and n variables, we use the circuit 
described above to construct a 3-CNF formula $\Phi_\varphi$ that is satisfiable iff $\varphi$ 
is satisfiable. Additionally, if $\psi$ is another E3CNF formula with m clauses and 
n variables, the factor graph of $\Phi_\varphi$ is also the factor graph of $\Phi_\psi$. 
For every input of a gate and for the output of the circuit there will be a variable. 
Note that the output of every gate is an input of some other gate or the output 
of the circuit. 

Each gate contributes a bounded number of clauses to $\Phi_\varphi$ that encode the requirement 
that the output of the gate is correct. For example, a NAND gate 
with inputs x, y and output z would contribute the clauses $x \vee y \vee z$, $x \vee \bar{y} \vee z$, 
$\bar{x} \vee y \vee z$, $\bar{x} \vee \bar{y} \vee \bar{z}$. This ensures that a satisfying assignment to $\Phi_\varphi$ is a valid 
calculation of the circuit. That is, every variable has the value passed to or from 
each gate. 
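
As a quick sanity check of the gate encoding, one can enumerate all eight assignments to (x, y, z) and confirm that the four NAND clauses accept exactly the triples with z = NAND(x, y). The representation of clauses as (variable, is_positive) pairs and the helper names are our own illustrative choices.

```python
from itertools import product

# The four clauses of a NAND gate with inputs x, y and output z:
# each clause rules out one input/output combination that the gate forbids.
NAND_CLAUSES = [
    [("x", True), ("y", True), ("z", True)],
    [("x", True), ("y", False), ("z", True)],
    [("x", False), ("y", True), ("z", True)],
    [("x", False), ("y", False), ("z", False)],
]


def satisfies(assignment, clauses):
    """True iff every clause contains a literal made true by the assignment."""
    return all(any(assignment[v] == pos for v, pos in cl) for cl in clauses)


def allowed_triples(clauses):
    """All (x, y, z) value triples that satisfy the clauses."""
    return {
        (x, y, z)
        for x, y, z in product([False, True], repeat=3)
        if satisfies({"x": x, "y": y, "z": z}, clauses)
    }
```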

Let o be the variable representing the output of the circuit. The clause o is also 
added to $\Phi_\varphi$. This ensures that an assignment satisfies the formula iff the output 
of the circuit is 1. 

Let $i_1, \ldots, i_{3m(\lceil \log_2 n \rceil + 1)}$ be the input bits of the circuit. For every j, either the 
clause $i_j$ or the clause $\bar{i}_j$ is added to $\Phi_\varphi$, depending on the representation of $\varphi$. 
This ensures that a satisfying assignment to $\Phi_\varphi$ has the representation of $\varphi$ in 
the input of the circuit. Note that the only difference between $\Phi_\varphi$ and $\Phi_\psi$ is in 
the polarity of these clauses. 

A standard transformation can be used to transform the formula from 3CNF 
to E3CNF. 

If $\Phi_\varphi$ is satisfiable, the variables representing the selector bits prove that the 
circuit can be made to output 1 when $\varphi$ is given as input. If there are selector 
bits that make the circuit output 1 on $\varphi$, setting each variable to its respective 
input/output value in the circuit shows that $\Phi_\varphi$ is satisfiable. □ 

Equipped with Lemmas [2] and [1] we now prove Theorem [2]. 

Proof. For simplicity of the presentation, we omit the O notation in the expres- 
sions that we derive. 

Assume that for some $0 < \delta < 1/10$ a hypothetical algorithm H can solve any 
instance on the universal factor graphs of Lemma [1] in time $2^{N^{1-\delta}}$. Consider now 
an arbitrary 3SAT instance with n variables. For $\epsilon = \delta/3$, use Lemma [2] to create 
$2^{n^{1-\epsilon}}$ new 3CNF formulas with at most $n^{1+2\epsilon} (\log n)^2$ clauses. Use Lemma [1] to 
reduce every such 3CNF instance to an instance on a universal factor graph 
with $N = n^{1+2\epsilon} (\log n)^4$ variables. Use algorithm H to solve these instances, 
thus obtaining the solution to the original 3SAT instance. The choice of $\epsilon = \delta/3$ 
implies that this whole procedure takes time roughly $2^{n^{1-\epsilon}}$. □ 
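
The running-time claim can be checked with a short calculation. The following is a sketch under the reading that H runs in time $2^{N^{1-\delta}}$ on N variables, that Lemma 2 produces $2^{n^{1-\epsilon}}$ formulas, and that $\delta = 3\epsilon$; constants and the O notation are dropped, as in the proof.

```latex
\begin{align*}
N^{1-\delta}
  &= \left(n^{1+2\epsilon}(\log n)^4\right)^{1-3\epsilon}
   = n^{(1+2\epsilon)(1-3\epsilon)} (\log n)^{4(1-3\epsilon)}
   = n^{1-\epsilon-6\epsilon^2} (\log n)^{4(1-3\epsilon)}, \\
\text{total time}
  &\le 2^{n^{1-\epsilon}} \cdot 2^{N^{1-\delta}}
   = 2^{\,n^{1-\epsilon}\left(1 + n^{-6\epsilon^2}(\log n)^{4(1-3\epsilon)}\right)}
   = 2^{\,n^{1-\epsilon}(1+o(1))}.
\end{align*}
```

The polylogarithmic factor is absorbed because $n^{-6\epsilon^2}$ decays polynomially for any fixed $\epsilon > 0$.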

C APX-Universal Factor Graphs 

In this section we prove Theorem [5]. 

Any universal factor graph for 3SAT (e.g., the result of Theorem [1]) is $(1 - \frac{1}{m})$-universal, 
where m is the number of clauses. In order to create a $(1 - \epsilon)$-universal 
factor graph (for some $\epsilon > 0$) the instances will go through an iterative process, 
increasing the worst case unsatisfiability of the formulas by a factor of 2, while 
increasing the size of the factor graph by a constant factor. The construction 
is based on the combinatorial method to prove the PCP theorem by Dinur [5], 
and closely follows the proof of Radhakrishnan and Sudan [31]. Familiarity with 
these earlier proofs (an overview of which can be found in [21]) can aid the 
reader in following our proof. 

C.1 Definitions 

Definition 5. A constraint satisfaction problem (CSP) has the form $P = (V, \Sigma, C)$, 
where V is the set of variables, $\Sigma$ is the alphabet, and C is the set of constraints. 
A constraint is $c = (U, f)$, where $U \subseteq V$ and $f : \Sigma^U \to \{0, 1\}$. An assignment 
is a function $a : V \to \Sigma$, giving each variable a value. Given an assignment 
a, a constraint $c = (U, f)$ is said to be satisfied by the assignment (usually, 
the assignment will be implied from the context) if $f(a|_U) = 1$; otherwise it is 
unsatisfied by the assignment. 

Given a constraint satisfaction problem P, UNSAT(P) is the minimal fraction 
of constraints (over all assignments) that are unsatisfied. The size of P is 
$|P| = |V| + |C|$. 
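
For intuition, UNSAT(P) can be computed directly from the definition on tiny instances. This is a brute-force sketch only; the representation of a constraint as a pair (U, f) with f taking a tuple of values, and the function name `unsat`, are choices made here for illustration.

```python
from fractions import Fraction
from itertools import product


def unsat(variables, alphabet, constraints):
    """Minimal fraction of unsatisfied constraints over all assignments.

    constraints: list of (U, f) where U is a tuple of variables and f maps the
    tuple of their values to 0 or 1 (1 means satisfied).
    """
    best = len(constraints)
    for values in product(alphabet, repeat=len(variables)):
        a = dict(zip(variables, values))
        violated = sum(1 for U, f in constraints if f(tuple(a[u] for u in U)) != 1)
        best = min(best, violated)
    return Fraction(best, len(constraints))


# Example: "not equal" constraints on the three edges of a triangle over {0, 1}.
neq = lambda t: int(t[0] != t[1])
triangle = [(("a", "b"), neq), (("b", "c"), neq), (("a", "c"), neq)]
print(unsat(["a", "b", "c"], [0, 1], triangle))
```

In the triangle example at least one edge must join two equal values, so one of the three constraints is always violated.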

Definition 6. A constraint hypergraph $H = (V, E, \Sigma, C)$ is an alternative definition 
of a CSP, where V, $\Sigma$, C are as in the definition of a CSP, and for every 
constraint $c = (U, f)$, $U \in E$. 

The structure of a constraint hypergraph $H = (V, E, \Sigma, C)$ is the hypergraph 
(V, E). 

The rank of H is the maximal cardinality of an edge. 

In the special case where all sets in E have cardinality 2, H is a constraint 
graph. 

The following definition adds parametrization to the earlier definition of 
FGPR given in Section HOI 

Definition 7. A $(\delta, d)$-FGPR (Factor Graph Preserving Reduction) is a transformation 
of instances of one class of CSP to instances of another class of CSP 
with the additional requirements: 

— If the factor graphs of A and B are equal, then the factor graphs of their 
transformations are equal. 

— There is some constant $\xi > 0$ such that if A is transformed to A' then: 

• if UNSAT(A) = 0, then UNSAT(A') = 0. 

• if $\mathrm{UNSAT}(A) \ge \epsilon$, then $\mathrm{UNSAT}(A') \ge \delta \min\{\epsilon, \xi\}$. 

— $d|A| \ge |A'|$. 

For example, the standard reduction from 3SAT to E3SAT is a $(\frac{1}{4}, 4)$-FGPR 
(a $(\frac{1}{2}, 2)$-FGPR if all clauses have at least two distinct variables). 

In order to create an APX-universal factor graph a $(\delta, d)$-FGPR with $\delta > 1$ 
will be constructed, using a composition of several FGPRs. In order to compose 
these FGPRs correctly, each will need to have additional properties. 

It will be convenient for us to represent constraints as polynomials. For example, 
a constraint of the form "the first bit in the representation of the variable x is 
equal to the second bit of the representation of the variable y" can be represented 
as requiring that the polynomial $x_1 + y_2$ be equal to 0 (where $x_1$ refers to the value 
of the first bit in the assignment of x, and $y_2$ refers to the second bit of the 
assignment of y). 

Definition 8. Let $\Sigma = \mathbb{F}_2^k$. A constraint e is m-restricted if it can be represented 
as a set of up to m polynomials of degree two, $\{P_i^e\}$, such that the constraint is 
satisfied iff all the polynomials are 0 (where the assignment of variables is treated 
as the values of k Boolean variables). In such case we say that e is associated 
with $\{P_i^e\}$. An m-restricted constraint (hyper)graph is a constraint (hyper)graph 
with only m-restricted constraints. 

Definition 9. Two m-restricted constraint hypergraphs are close if they share 
the same factor graph and alphabet, and for every edge e of the factor graph, 
if the constraint corresponding to e is associated with $\{P_i^e\}$ then the constraint 
corresponding to e in the other graph is associated with $\{P_i^e + b_i^e\}$, where $b_i^e \in \{0, 1\}$. 

C.2 Changing the Representation 

Lemma 4. There is an explicit $(\frac{1}{4}, 6)$-FGPR from 3-SAT to 2-restricted constraint 
graphs. Additionally, if H, H' are created from $\varphi, \varphi'$, respectively, using 
the specified reduction and $\varphi, \varphi'$ have the same factor graph then H and H' are 
close. 

Proof. Each vertex in the constraint graph will represent up to two literals plus 
an additional bit, so $\Sigma = \mathbb{F}_2^3$. For a vertex w, $A_i(w)$ will correspond to the i'th 
bit of the assignment of the vertex. The value of the first bits is intended to be 
the value of the represented literals, 0 if true, 1 if false (and the constraints will 
try to enforce that). 

For every variable $v_i$ in the formula there will be a corresponding vertex $w_i$. 
For every clause of the form $x_i \vee x_j \vee x_k$ (where $x_\ell$ is $v_\ell$ or $\bar{v}_\ell$) there will be two 
vertices, $u_{ij}$ and $u_k$, with an edge between them. The constraint corresponding to the 
edge $(u_{ij}, u_k)$ expects the following two polynomial equations to be satisfied: 
$A_3(u_{ij}) = A_1(u_{ij}) A_2(u_{ij})$ (the last bit is true if one of the represented literals is true) 
and $A_3(u_{ij}) A_1(u_k) = 0$ (the clause is satisfied). For every clause of the form 
$x_i \vee x_j$ there will be two vertices, $u_i$ and $u_j$, with an edge between them with 
the constraint $A_1(u_i) A_1(u_j) = 0$ (the clause is satisfied). For every clause of the 
form $x_i$ a self loop will be added to the vertex $w_i$ with constraint $A_1(w_i) = b$, 
where $b = 0$ if $x_i$ is $v_i$ and $b = 1$ otherwise. 

In addition, consistency constraints will be added on the edges $(u_{ij}, w_i)$, $(u_{ij}, w_j)$, $(u_k, w_k)$ 
with respective constraints $A_1(u_{ij}) - A_1(w_i) = b_i^e$, $A_2(u_{ij}) - A_1(w_j) = b_j^e$, 
$A_1(u_k) - A_1(w_k) = b_k^e$, where $b_\ell^e$ is 0 or 1, depending on whether the variable $v_\ell$ 
or its negation appears in the clause the first vertex of the edge corresponds to. 

Every clause is responsible for the creation of at most two vertices and four 
edges. Every variable is responsible for the creation of one vertex. Thus $|H| \le 6M$. 

If UNSAT($\varphi$) = 0, there is an assignment $a_i$ for each $v_i$ satisfying all clauses. 
Setting $A_1(w_i)$ according to this assignment (0 if $a_i$ is true, 1 otherwise), 
setting $A_1(u_k)$, $A_1(u_{ij})$, $A_2(u_{ij})$ according to the value of the corresponding 
literal under the assignment, and setting $A_3(u_{ij}) = A_1(u_{ij}) A_2(u_{ij})$, will satisfy all edges of the graph. 

If UNSAT(H) < $\epsilon$, the best assignment to vertices can be transformed into an 
assignment to variables. If an edge is unsatisfied, the clause that generated that 
edge is considered to be unsatisfied. All edges and vertices generated by this 
clause will be removed. Repeating this for as long as unsatisfied edges remain, 
we are left with a completely satisfiable graph, with all consistency constraints 
holding. Thus, the assignment for the graph can be transformed to an assignment 
for the formula. Each unsatisfied edge may have caused a single clause to be 
unsatisfied, and since there are at most 4 times as many edges as there are 
clauses, UNSAT($\varphi$) < $4\epsilon$. 

Lastly, it is immediate that all formulas that have the same factor graph 
generate close constraint graphs. □ 
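
The gadget for a three-literal clause can be checked exhaustively. The sketch below follows our reading of the construction (bit value 0 means true, consistency constraints taken modulo 2, and $A_3(u_{ij})$ a free bit); the function name `gadget_satisfiable` and the argument layout are hypothetical.

```python
from itertools import product


def gadget_satisfiable(values, negated):
    """Can the edges created by one clause x_i v x_j v x_k all be satisfied?

    values:  truth values of the variables v_i, v_j, v_k.
    negated: for each position, whether the literal is the negated variable.
    Bits follow the convention 0 = true, 1 = false.
    """
    w = [0 if v else 1 for v in values]        # A_1(w_l) for the variable vertices
    b = [1 if neg else 0 for neg in negated]   # offsets b_l of the consistency edges
    # The consistency constraints determine the literal bits of u_ij and u_k.
    a1_uij = (w[0] + b[0]) % 2
    a2_uij = (w[1] + b[1]) % 2
    a1_uk = (w[2] + b[2]) % 2
    # A_3(u_ij) is free; the clause edge requires A_3 = A_1 * A_2 on u_ij
    # and A_3(u_ij) * A_1(u_k) = 0.
    for a3 in (0, 1):
        if a3 == (a1_uij * a2_uij) % 2 and (a3 * a1_uk) % 2 == 0:
            return True
    return False


def clause_true(values, negated):
    """Ordinary truth value of the clause under the assignment."""
    return any(v != neg for v, neg in zip(values, negated))
```

Running both functions over all 8 assignments and all 8 polarity patterns should show that the gadget is satisfiable exactly when the clause is.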



C.3 Gap Amplification 

Definition 10. An $(\eta, d)$-expander is a regular graph $G = (V, E)$ with degree 
d such that for all $S \subseteq V$ with $|S| \le |V|/2$, $|\{(u, v) \in E \mid u \in S, v \notin S\}| \ge \eta |S|$. An 
$(\eta, d)$-expander is positive if every vertex has at least $\frac{d}{2}$ self loops. 

Theorem 7 (See several constructions in [13]). There are $\eta, d > 0$ such 
that positive $(\eta, d)$-expander graphs exist on n vertices, for all n > 0. 
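
On small graphs the expansion condition can be verified by enumerating all vertex subsets. This sketch assumes the usual convention that only sets S with at most half of the vertices are considered, and that each undirected edge is listed once; the function names are ours.

```python
from fractions import Fraction
from itertools import combinations


def edge_expansion(n, edges):
    """Minimum of |boundary(S)| / |S| over nonempty S with |S| <= n / 2.

    Vertices are 0..n-1 (n >= 2) and edges is a list of pairs (u, v).
    Self loops (u, u) never cross a cut, so they do not affect the result.
    """
    best = None
    for size in range(1, n // 2 + 1):
        for subset in combinations(range(n), size):
            s = set(subset)
            boundary = sum(1 for u, v in edges if (u in s) != (v in s))
            ratio = Fraction(boundary, size)
            if best is None or ratio < best:
                best = ratio
    return best


def is_expander(n, edges, eta):
    """True iff every admissible S has at least eta * |S| boundary edges."""
    return edge_expansion(n, edges) >= eta
```

The enumeration is exponential in n, so this is only meant for experimenting with the definition, not for the explicit constructions cited above.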

Theorem 8. There is a universal constant a such that for every k,m,t G N, 
there is an explicit (5* (t) ,C\ (t))-FGPR from (m-restricted) constraint graphs 
with alphabet to (O (mt) -restricted) constraint graphs with alphabet F^ fc,t ^ 
with 8* (t) > at. 

Furthermore, every constraint of the produced constraint graph is a conjunc- 
tion of O (t) constraints from the input graph and equality constraints, and the 
set of constraints only depends on the factor graph of the input. 

Specifically, two m-restricted close constraint graphs are transformed into two 
close O(mt)-restricted constraint graphs. 

Proof. The proof closely follows the construction of the transformation in Section 
5 in [21]. 

Let the input of the transformation be a graph G. As in Lemma 5.3 in [21], 
a regular graph $G_1$ is created from the graph G. Each vertex $u \in V$ is replaced 
by $d_u$ (u's degree) vertices, with an $(\eta, d)$-expander embedded on them. 
In addition, each of the new vertices is connected to one of its neighbors. The 
constraints on the expanders' edges are equality constraints (polynomials of degree 
1). $\mathrm{UNSAT}(G_1) \ge \delta_1 \mathrm{UNSAT}(G)$, where $\delta_1$ depends on d and $\eta$ (which are 
constants). The proof for the last claim is contained in [21]. Also, if G is satisfied 
it is immediate that $G_1$ can be satisfied. 

As in Lemma 5.5 in [21], an expander $G_2$ is created from the graph $G_1$ 
by superimposing an $(\eta, d)$-expander on the graph $G_1$, with constraints that 
are always satisfied. Let $d_0$ be the degree of $G_1$. Then, immediately, $G_2$ is an 
$(\eta, d + d_0)$-expander. By adding $d + d_0$ self loops with constraints that are always 
satisfied (polynomials of degree 0), $G_2$ is a positive $(\eta, 2d + 2d_0)$-expander. 
$\mathrm{UNSAT}(G_2) \ge \delta_2 \mathrm{UNSAT}(G_1)$, where $\delta_2$ depends on d (if $\epsilon$ of the constraints 
are not satisfied in $G_1$, then $\delta_2 \epsilon$ of the constraints are not satisfied in $G_2$, using 
the same assignment). Also, an assignment that satisfies $G_1$ satisfies $G_2$. 

Lastly, we transform $G_2 = (V_2, E_2, \Sigma_2, C_2)$ to $G_3 = (V_2, E, \Sigma_2^{(d+d_0)^t}, C_3)$, 
where $G_3$ is the product constraint graph of $G_2$, as defined in Definition 5.13 in 
[21]. The value of $v \in V_2$ is supposed to be the concatenation of all the values 
of vertices of distance at most t from v in $G_2$ ($(d + d_0)^t$ is an upper bound on 
the number of vertices at distance at most t from any vertex). That is, for an 
assignment A and every two vertices $v, u \in V_2$ there is $A_u(v) \in \Sigma_2 \cup \{\emptyset\}$ (where 
$\emptyset \notin \Sigma_2$), the opinion of v on u. $A_u(v) = \emptyset$ iff the distance between v and u is 
more than t, and then we say that v has no opinion on u. 

In order to define the edges, we use some arbitrary order on the d neighbors 
of each vertex. The edges are intended to represent a simple random walk 
on the graph starting at a random vertex and stopping after each step with 
probability $\frac{1}{t}$. If the random walk does not stop after $5t + 1$ steps we call it 
a null walk and terminate it. There is one edge for each element of the set 
$V \times (\{1, \ldots, d\} \times \{1, \ldots, t\})^{5t}$. Each edge and its corresponding constraint are determined 
by a walk defined by $(a, (i_1, j_1), \ldots, (i_{5t}, j_{5t}))$ in the following way: The 
walk starts from $a = v_0$ (which is in $V_2$). In step k (starting from k = 1), the 
walk moves to the neighbor of $v_{k-1}$ numbered by $i_k$, and calls this vertex $v_k$. If 
$j_k = 1$, we stop (to simulate stopping with probability $\frac{1}{t}$); otherwise, we continue 
to step k + 1, until k = 5t. We call the last vertex reached b. If the walk 
stops before reaching step $5t + 1$ (a walk that is not null), the corresponding 
constraint checks that the opinions of a and b are the same on the vertices 
on the path (including a and b), if both have opinions on the vertices, and that 
the constraints of $G_2$ on the edges of the path are satisfied. Note that equality 
constraints can be modeled as requiring that a linear polynomial is 0. Thus, this 
creates an O(mt)-restricted constraint. For a null walk, the constraint is a self 
loop that is always satisfied. 
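
The way an edge label determines a walk can be made concrete as follows. This is a sketch of our reading of the description above: a label consists of a start vertex and 5t pairs $(i_k, j_k)$, the walk stops at the first step with $j_k = 1$, and it is treated as a null walk if no such step occurs. The adjacency-list representation and the name `decode_walk` are assumptions for illustration.

```python
def decode_walk(neighbors, a, steps, t):
    """Decode an edge label into a walk.

    neighbors: dict mapping each vertex to the ordered list of its neighbors.
    a:         start vertex.
    steps:     list of 5t pairs (i_k, j_k), with 1-based neighbor index i_k
               and j_k in {1, ..., t}.
    Returns (path, is_null), where path lists the visited vertices and the
    last entry of path is the endpoint b for a walk that is not null.
    """
    assert len(steps) == 5 * t
    path = [a]
    for i_k, j_k in steps:
        path.append(neighbors[path[-1]][i_k - 1])
        if j_k == 1:          # happens with probability 1/t for uniform j_k
            return path, False
    return path, True         # never stopped: null walk


# Example on a 4-cycle with t = 2 (so labels carry 10 steps).
cycle = {0: [1, 3], 1: [2, 0], 2: [3, 1], 3: [0, 2]}
label = [(1, 2), (1, 1)] + [(1, 1)] * 8
print(decode_walk(cycle, 0, label, 2))
```

Choosing the label uniformly at random reproduces the random walk with stopping probability 1/t per step, which is the reason for indexing edges this way.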

Note that the transformations transformed close graphs to close graphs and 
that the size of the graph increased by a factor that depends only on t (since 
degrees of expanders were chosen to be constants). 

The proof for the last transformation increasing the unsatisfiability is described 
throughout most of Section 5 of [31]. □ 

C.4 Alphabet Reduction 

Definition 11. The Hadamard code of a binary string $x = x_1 x_2 \ldots x_\ell \in \{0, 1\}^\ell$ 
is the value of all linear functions $\{0, 1\}^\ell \to \{0, 1\}$ on the bits of x. Given some 
arbitrary ordering on the linear functions, the i'th bit of the Hadamard code of 
x is the value of x on the i'th function. 

The quadratic code of a binary string $x = x_1 x_2 \ldots x_\ell \in \{0, 1\}^\ell$ is the value 
of all homogeneous quadratic functions $\{0, 1\}^\ell \to \{0, 1\}$ on the bits of x. Given 
some arbitrary ordering on the quadratic functions, the i'th bit of the quadratic 
code of x is the value of x on the i'th function. 
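
Both codes are easy to compute for short strings. In this sketch a linear function is identified with its coefficient vector, and a homogeneous quadratic function with a coefficient for each pair $x_i x_j$, $i < j$; this identification and the lexicographic ordering of the functions are our assumptions, chosen so that the number of functions matches the counts used later in this section.

```python
from itertools import combinations, product


def hadamard_code(x):
    """Values of all linear functions sum_i c_i x_i (mod 2) on the bits of x,
    ordered by the coefficient vector c in lexicographic order."""
    return [
        sum(c_i * x_i for c_i, x_i in zip(c, x)) % 2
        for c in product([0, 1], repeat=len(x))
    ]


def quadratic_code(x):
    """Values of all functions sum_{i<j} c_ij x_i x_j (mod 2) on the bits of x,
    ordered by the coefficient vector (c_ij) in lexicographic order."""
    pairs = list(combinations(range(len(x)), 2))
    return [
        sum(c * x[i] * x[j] for c, (i, j) in zip(coeffs, pairs)) % 2
        for coeffs in product([0, 1], repeat=len(pairs))
    ]


print(hadamard_code([1, 0, 1]))
print(quadratic_code([1, 0, 1]))
```

Both codes are linear in the function being evaluated: the bit for $L_1 + L_2$ is the sum modulo 2 of the bits for $L_1$ and $L_2$, and similarly for quadratic functions. This is the property that linearity tests on these codes rely on.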

Lemma 5. For every $k, m \in \mathbb{N}$ there is an explicit $(\delta_2^*, c_2(k, m))$-FGPR from 
close m-restricted constraint graphs with alphabet $\mathbb{F}_2^k$ to close 1-restricted constraint 
hypergraphs with alphabet $\mathbb{F}_2$. Additionally, the constructed hypergraph 
has rank 4. 

Proof. Given an m-restricted constraint graph $G = (V, E, \mathbb{F}_2^k, C)$, we construct a 
1-restricted constraint hypergraph $H = (V', E', \mathbb{F}_2, C')$. Following the construction 
of Section 6.3 in [21], for every $v \in V$ and linear function $L : \{0, 1\}^k \to \{0, 1\}$ 
create a vertex v(L) in V'. For every $e \in E$, for every homogeneous 
quadratic function $q : \{0, 1\}^{2k} \to \{0, 1\}$ create a vertex e(q) in V'. For every 
$e \in E$, for every linear function $L : \{0, 1\}^{2k} \to \{0, 1\}$ create a vertex e(L) in V'. 
The value of the assignment A on vertex v is referred to as A(v). 

For every $e = (u, v) \in E$, linear functions $L_1, L_2 : \{0, 1\}^k \to \{0, 1\}$ and $L_3, L_4 : \{0, 1\}^{2k} \to \{0, 1\}$, 
homogeneous quadratic functions $q_1, q_2 : \{0, 1\}^{2k} \to \{0, 1\}$, and 
$w \in \{0, 1\}^m$, there will be seven constraints and corresponding edges (with multiplicity): 

1. A constraint that is satisfied iff $A(u(L_1)) + A(u(L_2)) = A(u(L_1 + L_2))$. 

2. A constraint that is satisfied iff $A(v(L_1)) + A(v(L_2)) = A(v(L_1 + L_2))$. 

3. A constraint that is satisfied iff $A(e(L_3)) + A(e(L_4)) = A(e(L_3 + L_4))$. 

4. A constraint that is satisfied iff $A(e(q_1)) + A(e(q_2)) = A(e(q_1 + q_2))$. 

5. Let L be the linear function given by $L(X, Y) = L_1(X) + L_2(Y)$. There is 
a constraint that is satisfied iff $A(u(L_1)) + A(v(L_2)) = A(e(L))$. 

6. A constraint that is satisfied iff $A(e(L_3)) A(e(L_4)) = A(e(q_1 + L_3 L_4)) - A(e(q_1))$. 

7. Let P be a homogeneous degree two polynomial and $b \in \{0, 1\}$ such that 
$P = \sum_i w_i P_i + b$, where $\{P_i\}$ is the set of polynomials associated with e. There 
is a constraint that is satisfied iff $A(e(q_1 + P)) - A(e(q_1)) - b = 0$. 

There are $2^\ell$ linear functions on $\ell$ bits. There are $2^{\binom{\ell}{2}}$ quadratic functions 
on $\ell$ bits. Thus, there are $2^k |V| + (2^{2k} + 2^{k(2k-1)}) |E|$ vertices and 
$7 |E| \cdot 2^{m + 6k + 2k(2k-1)}$ edges. Hence $|H| \le 2^{O(k^2 + m)} |G|$. 

The satisfaction of every hyperedge depends only on the value of at most 4 
vertices. 

Given an assignment A satisfying G, there is an assignment A' satisfying H. 
For every i, v, assign the i'th bit of the Hadamard code of A(v) to the vertex 
$v(L_i)$. Note that this satisfies all constraints of type 1 and 2, since the Hadamard 
code is linear. For every i, $e = (u, v)$ assign the i'th bit of the Hadamard code of 
$A(u) \circ A(v)$ (the concatenation of the binary strings) to the vertex $e(L_i)$. Note 
that this satisfies all constraints of type 3. Assign the i'th bit of the quadratic 
code of $A(u) \circ A(v)$ to the vertex $e(q_i)$. A quadratic code of any string x on the 
vertices e(q) (for all quadratic functions q) will pass the constraints of type 6 
if the Hadamard code of x is on the vertices e(L) (for all linear functions L), 
and this is the case for A'. The constraints of type 5 check the consistency of the 
Hadamard code between vertices of e(L) and vertices corresponding to u and 
v, and they are consistent in A'. Lastly, constraints of type 7 are satisfied, since 
the polynomials associated with edges of G are all 0 when given assignment 
A. Since the code on e(q) is linear, the check becomes $A'(e(P)) - b$, which, by 
definition of A', is the value of $P - b$ on the assignment A, which is a sum of 
polynomials that are all 0. 

If two constraint graphs G and G* are close, then their transformations, H 
and H*, only differ in the constraints of type 7, in the constant of the associated 
polynomials (actually linear functions). 

The proof that the transformation decreases the fraction of unsatisfiable constraints 
by at most a constant factor is the same as in Lemma 6.11 in [21]. □ 

C.5 Composition 

Firstly, we transform the hypergraph back to a 3CNF formula. 

Lemma 6. There is an explicit $(\delta_3^*(h), c_3(h))$-FGPR from close 1-restricted 
constraint hypergraphs with alphabet $\mathbb{F}_2$ to 3-SAT, where h is the rank of the hypergraph. 

Proof. Given a 1-restricted constraint hypergraph $H = (V, E, \mathbb{F}_2, C)$, a 3-CNF 
formula $\varphi$ is defined. 

For every vertex $v \in V$, there is a variable v. For every edge e, there is a 
variable $w_e$. 

For every edge e, the corresponding constraint $c_e$ is satisfied iff the quadratic 
polynomial $P_e + b_e$ is evaluated to 0, where $P_e$ is homogeneous. There is a set of 
at most $2^h$ 3-CNF clauses that is evaluated to true iff $P_e(u_1, u_2, \ldots, u_h) + w_e = 0$. 
Adding to this set the clause $w_e$ if $b_e = 1$ and the clause $\bar{w}_e$ otherwise gives a 
set of at most $2^h + 1$ clauses that are all satisfied iff the constraint $c_e$ is satisfied 
(using the same assignment, omitting the variables of the form $w_e$). 

Thus, we have that $|\varphi| \le (2^h + 1) |H|$. Also, if UNSAT($\varphi$) $\le \epsilon$, then UNSAT(H) $\le (2^h + 1) \epsilon$. 
Finally, it is immediate that when transforming close 1-restricted constraint 
hypergraphs, the resulting 3CNF formulas all have the same factor graph. □ 

Now we can compose all transformations to get the required FGPR. 

Theorem 9. There is an explicit $(2, c_4)$-FGPR from 3-SAT to 3-SAT. 

Proof. Using Lemma [4], Theorem [8], Lemma [5] and Lemma [6], there is a 

(^atS* 2 S* (4) , 6 Cl (t) c 2 (2 S < 3 ^, O (t)) c 3 (4) J - FGPR 

from 3-SAT to 3-SAT formulas, for all t (it is easy to verify that the composition 
of these FGPRs is indeed an FGPR). Specifically, there is a constant t for which this is a 
$(2, c_4(t))$-FGPR. □ 



Starting with a universal factor graph for 3-SAT, $O(\log n)$ applications of Theorem 
[9] prove Theorem [5]. 

D Threshold-Universal Factor Graphs 

D.1 Oblivious folding of the long code - an overview 

A major ingredient in tight hardness of approximation results is the long code, 
introduced in [6]. We present here an overview of the difficulties involved in 
adapting hardness proofs based on the long code to the oblivious setting, and 
of our approach for handling these difficulties. A superficial familiarity with 
previous work should suffice in order to follow this overview - there is no need 
to know concepts related to the analysis of the long code, such as dictatorship 
tests and Fourier analysis. 

For a given value of k, the long code replaces a vector x of k original Boolean 
variables by a vector z of $2^{2^k}$ new Boolean variables. The method of doing this 
can be visualized as follows. Consider a $2^k$ by $2^{2^k}$ matrix LC (for long code). The 
rows are indexed by all $2^k$ possible assignments to the x variables. The columns 
are indexed by all possible Boolean functions. Namely, the column vectors are 
all possible $2^{2^k}$ truth-tables for Boolean functions on k variables (equivalently, 
all possible column vectors of dimension $2^k$). Every row of the matrix is then 
the long code of its corresponding assignment. The z variables are intended to 
correspond to the columns, and their values (as a vector) are intended to be 
equal to one row in the matrix (one long code). A verifier may perform various 
tests to verify (in a probabilistic sense) that this is indeed the case. Moreover, 
the verifier would like this row to correspond to an assignment to the x variables 
that satisfies some predicate h. The concept of folding assists in these tests. 

Folding over true. Columns in LC can be paired, where each column (say $z_i$) 
is paired with the column (say $z_j$) that is its complement. If z is indeed a long 
code, then in each such pair, one of the two corresponding variables is redundant 
and hence is dropped. For example, to read the value of $z_j$, read the value of $z_i$ 
and flip this value. After folding over true, half the number of z variables remain. 

Folding over h. Consider the truth table for h (as a column vector). Only 
certain rows of LC have a value of 1 in this vector, and these rows correspond to 
the assignments that satisfy the predicate h. In these rows, the following holds. 
Consider an arbitrary variable $z_i$ whose column corresponds to the function 
f. Then its value is exactly the complement of the variable $z_j$ whose column 
corresponds to the function f + h (addition modulo 2). Hence again columns 
can be paired (columns that differ by h), and one member from each pair can be 
dropped. After folding over h, half the number of z variables remain. We remark 
that folding over true is a special case of folding over h, for a trivial choice of h 
as the predicate that always accepts (its truth table is all 1). 

Conditioning over h. Consider the rows that correspond to the assignments that 
satisfy the predicate h, and let r denote their number. The columns restricted to 
these rows can be partitioned into $2^r$ equivalence classes, based on equality. Only 
one member from each equivalence class needs to be retained, as its z-variable 
determines the value of the corresponding z variables for all other columns in 
its equivalence class. Hence after conditioning, the number of z variables that 
would remain is $2^r$. 
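
The three operations can be illustrated on the long code matrix for a very small k. The sketch below represents each column by its truth table (a tuple of $2^k$ bits) and each predicate h by its truth table as well; these representations and the function names are our own choices for illustration.

```python
from itertools import product


def long_code_columns(k):
    """All 2^(2^k) columns of LC: truth tables of Boolean functions on k bits.
    Row r of the matrix is the tuple of r'th entries of these columns."""
    return list(product([0, 1], repeat=2 ** k))


def fold_over(cols, h):
    """Keep one representative from each pair {f, f + h} (addition mod 2).
    Folding over true corresponds to h being the all-ones truth table."""
    kept = set()
    for f in cols:
        partner = tuple((a + b) % 2 for a, b in zip(f, h))
        if partner not in kept:
            kept.add(f)
    return kept


def condition_over(cols, h):
    """Restrict columns to the rows satisfying h and merge equal columns."""
    sat_rows = [i for i, bit in enumerate(h) if bit == 1]
    return {tuple(f[i] for i in sat_rows) for f in cols}


cols = long_code_columns(2)               # 16 columns for k = 2
h = (1, 0, 1, 0)                          # a predicate with r = 2 satisfying rows
print(len(fold_over(cols, (1, 1, 1, 1))), len(fold_over(cols, h)),
      len(condition_over(cols, h)))
```

For k = 2 this should leave half of the 16 columns after either folding, and $2^r = 4$ classes after conditioning on a predicate with r = 2 satisfying assignments.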

Hardness of approximation results involve long code tests that query some 
of the z variables and determine whether their values are consistent with the 
assumption that all z variables form a valid long code. Basically, there are two 
versions of long code tests. One version (as in Bellare et al [5]) involves decoding 
to long codes, in which if the z variables pass the test (with high probability) then 
the conclusion is that there are long codes that are close (in Hamming distance) 
to the vector of values of the z variables. (There might be more than one such 
long code, depending on the error probability that we allow in the long code test, 
but their number is small.) The other version which is the one that leads to tight 
hardness of approximation results (as in Hastad |12j ) involves decoding to linear 
combinations, in which if the z variables pass the test (with high probability) 
then the conclusion is that there are words that are close to the vector of values 
of the z variables, where these words are linear combinations are over a small 
number of long codes words. (Also here there might be several such words that 
are linear combinations.) Folding over h (and also conditioning over h) ensures 
that every decoded long code in the first version corresponds to an assignment 
to the x variables that satisfies h. However, for the second version, there is a 
distinction between folding and conditioning. Conditioning over h ensures that 
every decoded linear combination is over long code words that correspond to 
assignments to the x variables that satisfies h. Folding over h only ensures that 
at least one long code word does. 

We remark that on top of folding (or conditioning) over h, long code tests 
employ also folding over true (which regardless of other folding or conditioning 
operations, drops half of the previously remaining z variables). This is relevant to 
our discussion, as it illustrates the principle that two z variables can be deemed 
equivalent not only if they have equal values in all long codes of interest (e.g., 
all long codes that correspond to assignments to the x variables that satisfy the 
predicate h), but also if they have complementary values in all these long codes. 

In order to obtain universal factor graphs, we need that even if h is changed 
due to a change of polarity of variables (e.g., from $x_1 \vee x_2 \vee x_3$ to $\bar{x}_1 \vee x_2 \vee x_3$), the 
factor graph does not change. However, changing h changes the pairing in the 
folding and also changes the equivalence classes in the conditioning, and hence 
changes which z variables remain. As a consequence, the resulting factor graphs 
change. 

Our solution to this problem is through the notion of an oblivious folding. 
Consider for example a predicate h corresponding to clause $C_j = x_1 \vee x_2 \vee x_3$. 
The long code associated with an assignment to the three variables $x_1, x_2, x_3$ will 
have $2^{2^3}$ variables (z variables). Folding over h will pair some of these variables. 
If polarities in $C_j$ change (say, to $\bar{x}_1 \vee x_2 \vee x_3$) we get a different predicate h' 
that leads to a different pairing. To overcome this problem, oblivious folding 
introduces auxiliary variables and corresponding new predicates. In the example 
above, this entails introducing three fresh auxiliary variables $y_{j1}, y_{j2}, y_{j3}$ and 
replacing the clause $C_j$ by the conjunction of four constraints, one that we call 
here a shadow constraint $y_{j1} \vee y_{j2} \vee y_{j3}$, and three equality constraints $y_{j1} = x_1$, 
$y_{j2} = x_2$, $y_{j3} = x_3$. If polarities in $C_j$ change as above, the shadow constraint 
remains unchanged and only the equality constraints change (to $y_{j1} \neq x_1$ in our 
example). Note that an equality constraint is simply negated when the corresponding 
x variable is negated (equality changes to inequality). Now consider 
a long code associated with an assignment to six variables (namely, to the x 
variables and the auxiliary y variables). The number of z variables is now $2^{2^6}$. 
Consider folding over an equality constraint $h_{=}$. If the constraint is negated to 
give an inequality constraint $h_{\neq}$, then rather than change the pairing in the folding 
(to be folding over $h_{\neq}$ rather than over $h_{=}$), one can instead keep the same 
folding as for the original $h_{=}$, but flip the nature of the relation between the two 
z variables that form a pair (instead of requiring them to be different, requiring 
them to be the same). 

In summary, our oblivious folding is performed as follows. Given a set of x 
variables and a predicate h, we add auxiliary y variables and express h as the 
conjunction of a shadow predicate over the y variables and equality constraints 
between the x and y variables. We consider the long code with respect to the 
combination of all variables. Thereafter we fold over all the new predicates one by 
one (the order does not matter), eventually giving a partition of the z variables 
into equivalence classes. In fact, for the shadow constraint we could use conditioning 
rather than folding; the main aspect is that for each of the equality 
constraints we use folding. If the polarity of an x variable changes, the partition 
of z variables does not change. The only change is the relation among z variables 
within the partition - two variables that were previously deemed equal might 
now be considered as negations of each other. As a result, when an x variable 
is negated the factor graph over the z variables remains unchanged, and only 
polarities of some of the z variables change. 
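
The key point, that negating an equality constraint keeps the pairing and only flips the relation inside each pair, can be verified on the smallest case. The sketch below is illustrative only: it assumes one x variable and one auxiliary y variable, with functions encoded as integer truth tables (bit a of the integer is the value on assignment a).

```python
n = 2  # two variables: x is bit 0 of an assignment, the auxiliary y is bit 1
rows = range(2 ** n)
num_columns = 2 ** (2 ** n)

# Truth table of the equality constraint y = x.
h_eq = sum(1 << a for a in rows if (a & 1) == ((a >> 1) & 1))

# The pairing is determined by h_eq alone, whatever the polarity of x.
pairs = {frozenset((f, f ^ h_eq)) for f in range(num_columns)}


def relation_in_row(a):
    """Values of z_f XOR z_{f + h_eq} over all f, in the long code of assignment a."""
    return {((f >> a) & 1) ^ (((f ^ h_eq) >> a) & 1) for f in range(num_columns)}


# Rows with y = x: paired variables are complementary (XOR is 1).
# Rows with y != x: the same pairs hold equal values (XOR is 0).
relations = {a: relation_in_row(a) for a in rows}
```

Each assignment sees a single relation across all pairs, so a change of polarity can be absorbed by negating z variables without changing which variables are paired.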

Our oblivious folding can be used in conjunction with the proof of Bellare 
et al. [6] (after proper modifications), because our equivalence classes capture 
all equivalences used by (standard) folding over h. We do not know whether 
our oblivious folding can be used in conjunction with the proof of Håstad [12], 
because oblivious folding does not capture all equivalences captured by conditioning. 
For example, given a long code (of length $2^{2^t}$) for t variables, conditioning 
over an equality constraint between two variables creates $2^{2^{t-1}}$ equivalence 
classes, whereas folding (which is an oblivious folding, since we only have an 
equality constraint) creates $2^{2^t - 1}$ equivalence classes. As explained earlier, when 
decoding to a linear combination of long codes, we know that for each equality 
constraint at least one long code in the linear combination satisfies it. However, 
several equality constraints are introduced by our oblivious folding, and it might 
be the case that none of the long codes in the linear combination satisfy all of 
them. 
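
The two counts in the example above can be confirmed by enumeration for small t. This is a sketch under the assumption that functions on t bits are encoded as integer truth tables and that the equality constraint is between the first two variables; the function name is hypothetical.

```python
def count_classes(t):
    """Compare conditioning and folding over the constraint v1 = v2 on t variables."""
    rows = range(2 ** t)
    num_columns = 2 ** (2 ** t)
    # Truth table of v1 = v2 (bits 0 and 1 of the assignment index).
    h = sum(1 << a for a in rows if (a & 1) == ((a >> 1) & 1))
    good_rows = [a for a in rows if (h >> a) & 1]
    # Conditioning: columns are equivalent if they agree on all satisfying rows.
    conditioning = len({tuple((f >> a) & 1 for a in good_rows)
                        for f in range(num_columns)})
    # Folding: columns f and f + h are paired.
    folding = len({min(f, f ^ h) for f in range(num_columns)})
    return conditioning, folding


counts = {t: count_classes(t) for t in (2, 3)}
```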

D.2 Proof of Theorem H 

Recall that the FGPR in Lemma [5] creates a 1-restricted constraint hypergraph 
of rank 4, such that the degree of every vertex depends only on the size of the 
alphabet and the degree of the constraint graph. Additionally, the FGPR in 
Theorem [8] creates a bounded degree graph with alphabet that depends only 
on t. Thus, it is possible to use the series of FGPRs in Theorem [9] without the 
FGPR in Lemma H2 to get a universal factor graph for 1-restricted constraint 
hypergraphs of rank 4, such that the degree of every vertex is bounded by a 
universal constant. 

We now follow the proof of Bellare et al. [6] (with the modification of the 
folding) to show hardness of approximation within a factor of II . 

Given some fixed 1-restricted constraint hypergraph H with alphabet {0, 1}, 
rank 4 and bounded degree, consider the two prover game $H^u$: The outer verifier 
picks uniformly at random a set C of u hyperedges (constraints). For a constraint 
c, let $B_c$ be the set of bits the value of c depends on. For every $c \in C$, the outer 
verifier picks uniformly at random a bit in $B_c$. Call the set of selected bits B. 
Then, the outer verifier gives $P_1$ the set C and gives $P_2$ the set B. $P_1$ is expected 
to return a satisfying assignment to each edge. $P_2$ is expected to return an 
assignment to the set of bits, consistent with the assignment $P_1$ gives (the same 
assignment for the same bits). The outer verifier accepts if both conditions on 
the responses of the provers hold. 
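
One round of the outer verifier can be sketched as follows. This is only an illustration under simplifying assumptions that are not taken from the text: constraints are sampled independently (with replacement), a constraint is represented as a pair of a variable tuple and a Python predicate, and provers are plain functions. All names are hypothetical.

```python
import random


def outer_verifier_round(constraints, u, prover1, prover2, rng):
    """Run one round: sample u constraints and one bit from each, query both provers."""
    chosen = [rng.choice(constraints) for _ in range(u)]
    bits = [rng.choice(variables) for variables, _ in chosen]
    answers1 = prover1(chosen)  # one local assignment (dict) per chosen constraint
    answers2 = prover2(bits)    # one value per selected bit
    for (variables, predicate), local, bit, value in zip(chosen, answers1, bits, answers2):
        # Accept only if P1 satisfies the constraint and agrees with P2 on the bit.
        if not predicate(local) or local[bit] != value:
            return False
    return True


# A toy satisfiable instance with honest provers that answer from one assignment.
assignment = {0: 1, 1: 0, 2: 1, 3: 1}
constraints = [
    ((0, 1, 2), lambda a: a[0] ^ a[1] ^ a[2] == 0),
    ((1, 2, 3), lambda a: a[1] or (a[2] and a[3])),
]


def honest1(chosen):
    return [{v: assignment[v] for v in variables} for variables, _ in chosen]


def honest2(bits):
    return [assignment[b] for b in bits]


def flipped2(bits):
    return [1 - assignment[b] for b in bits]
```

With honest provers every round accepts, while a second prover that answers with flipped bits is caught in every round, since it disagrees with the first prover on whichever bit is selected.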

For u = 1, it is easy to see that the answers of $P_2$ define an assignment, 
a, for the graph and the answers of $P_1$ define an assignment for each edge, 
separately. If an assignment for some edge is inconsistent with a and the verifier 
chose this edge, the probability that the verifier finds the inconsistency is at 
least 4 (the probability that P2 is asked to reveal an inconsistent bit). Thus, if 
UNSAT (G) > e, the verifier rejects with probability at least |. It is immediate 
that if UNSAT (G) = 0, there is an assignment that makes the verifier always 
accept. 

Using the Parallel Repetition Theorem [23], for every $\epsilon > 0$, if the original graph 
is at most $(1 - \epsilon)$-satisfiable, there is $0 < c_\epsilon < 1$ such that the verifier $V^u$ accepts with 
probability at most $c_\epsilon^u$, for all u > 0. Again, if the graph is satisfiable, the verifier 
(after parallel repetition) can be made to always accept. 

Definition 12. The string $A \in \{0,1\}^{\mathcal{F}_n}$ (where $\mathcal{F}_n$ is the set of all functions 
from n bits to 1 bit) is the long code of a string $x \in \{0,1\}^n$ if for all $f \in \mathcal{F}_n$, 
$A_f = f(x)$. 

Definition 13. $A \in \{0,1\}^{\mathcal{F}_n}$ is said to be folded over $(h, b)$ ($h \in \mathcal{F}_n$, $b \in \{0,1\}$) 
if for all $f \in \mathcal{F}_n$, $A_{f+h} = A_f + b$. 

Let $h_1, \ldots, h_k$ be linearly independent functions and $b_1, \ldots, b_k$ be bits. Let $\prec$ 
be some total ordering of $\mathcal{F}_n$. Let $\mu(f)$ be the minimal function among 
$\{f + \sum_i \sigma_i h_i \mid \sigma_i \in \{0,1\}\}$, and let $\sigma_i^f$ be such that $\mu(f) = f + \sum_i \sigma_i^f h_i$. 
We say that $B \in \{0,1\}^{\mathcal{F}_n}$ is the folding over $(h_1, \ldots, h_k), (b_1, \ldots, b_k)$ of 
$A \in \{0,1\}^{\mathcal{F}_n}$ if $B_f = A_{\mu(f)} + \sum_i \sigma_i^f b_i$. 

Note that B is folded over $(h_i, b_i)$ for each i. 

It is possible to define folding over a set of linearly dependent functions, 
provided that the values of the bits $\{b_i\}$ are consistent with the 
linear dependencies over the functions $\{h_i\}$. We do not present such a definition, 
because it is not required for our proofs. 
For our usage of folding, it suffices to fold over linearly independent functions. 
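
The definition of folding over several functions can be made concrete with a short sketch. It assumes, for illustration only, that functions are encoded as integer truth tables, that the total ordering is the usual order on these integers, and that the given functions are linearly independent; the function name `fold` is hypothetical.

```python
def fold(A, hs, bs):
    """Fold the table A over (hs, bs).

    For each f, find the minimal element mu(f) of the coset f + span(hs)
    (integer order on truth tables) and the coefficients sigma reaching it,
    then set B[f] = A[mu(f)] + sum_i sigma_i * b_i (mod 2).
    """
    B = [0] * len(A)
    for f in range(len(A)):
        best = None
        for sigma in range(2 ** len(hs)):
            g, shift = f, 0
            for i, (h, b) in enumerate(zip(hs, bs)):
                if (sigma >> i) & 1:
                    g ^= h
                    shift ^= b
            if best is None or g < best[0]:
                best = (g, shift)
        B[f] = A[best[0]] ^ best[1]
    return B


n = 2
num_functions = 2 ** (2 ** n)
hs = [0b1111, 0b1001]  # the constant-1 function and the equality of the two bits
bs = [1, 1]

# An arbitrary table: its folding satisfies B[f + h_i] = B[f] + b_i for each i.
arbitrary = [(f * 7 + 3) % 5 % 2 for f in range(num_functions)]
folded = fold(arbitrary, hs, bs)

# The long code of x = 0 satisfies h_i(x) = b_i, so folding leaves it unchanged.
long_code_of_zero = [f & 1 for f in range(num_functions)]
```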

In order to check the validity and consistency of the assignment using disjunctive 
clauses with 3 literals each, the provers are expected to give the long 
code for the assignment for the variables they are given (the sets C and B as before), 
and an inner verifier will check their answer. Let A be the answer of $P_1$, D 
the answer of $P_2$. Let $h_1, \ldots, h_u$ be the homogeneous part of the constraints in C 
(the constraints are degree two polynomials). The verifier will access the folding 
of A over $(1, h_1, \ldots, h_u), (1, b_1, \ldots, b_u)$, where $b_1, \ldots, b_u$ are the constants of the 
respective constraints. Since every vertex appears a bounded number of times, 
independent of the size of the graph, for large enough hypergraphs the functions 
will be linearly independent with very high probability. If they are dependent, 
we can always accept while only slightly increasing the satisfiability of the game. 

Note that when we say that the verifier checks the bit $A_f$, it will actually 
access the bit $A_{\mu(f)}$ and will invert it depending on $(h_1, \ldots, h_u), (b_1, \ldots, b_u)$. 

$\mu(f)$ only depends on $(h_1, \ldots, h_u)$. Close constraint hypergraphs have the same 
constraints, up to a constant. So, if two constraint hypergraphs are close, the 
accessed bit will be the same and the only difference will be in whether the bit 
is negated or not. 

The verifier will be modeled as a 3CNF formula, such that all close constraint 
hypergraphs are transformed into a formula with the same factor graph. The inner 
verifier will check one of the following three constraints: 

1. For $f, g \in \mathcal{F}_{3u}$ chosen uniformly, check that $A_f + A_g = A_{f+g}$. This test passes 
iff the four clauses $\bar{A}_f \vee \bar{A}_g \vee \bar{A}_{f+g}$, $\bar{A}_f \vee A_g \vee A_{f+g}$, $A_f \vee \bar{A}_g \vee A_{f+g}$, 
$A_f \vee A_g \vee \bar{A}_{f+g}$ are satisfied. 

2. For $f, g, h \in \mathcal{F}_{3u}$ chosen uniformly, if $A_f = 0$, check that $A_{fg+h} = A_h$. If 
$A_f = 1$, check that $A_{fg+g+h} = A_h$. This test passes iff the four clauses 
$A_f \vee \bar{A}_{fg+h} \vee A_h$, $A_f \vee A_{fg+h} \vee \bar{A}_h$, $\bar{A}_f \vee \bar{A}_{fg+g+h} \vee A_h$, $\bar{A}_f \vee A_{fg+g+h} \vee \bar{A}_h$ 
are satisfied. 

3. For $f \in \mathcal{F}_{3u}$, $g' \in \mathcal{F}_u$ chosen uniformly, check that $D_{g'} = A_{g+f} + A_f$ (where 
g is g' extended to the domain of all 3u bits of C, such that g does not 
depend on the bits in C that are not in B). This test passes iff the four 
clauses $\bar{A}_f \vee \bar{A}_{g+f} \vee \bar{D}_{g'}$, $\bar{A}_f \vee A_{g+f} \vee D_{g'}$, $A_f \vee \bar{A}_{g+f} \vee D_{g'}$, $A_f \vee A_{g+f} \vee \bar{D}_{g'}$ 
are satisfied. 

The verifier chooses which of the constraints to check with some probability that 
will be implied in the proof. 
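
That each test is equivalent to the conjunction of four clauses with 3 literals can be checked exhaustively over the queried bits. The sketch below is illustrative: the clause representation (lists of position and wanted-value pairs) is an assumption of this example and not notation from the proof.

```python
from itertools import product


def satisfied(clauses, bits):
    """A clause is a list of (position, wanted_value); it holds if any literal matches."""
    return all(any(bits[i] == v for i, v in clause) for clause in clauses)


# Tests 1 and 3 have the form a + b = c (mod 2) on three queried bits.
parity_clauses = [
    [(0, 0), (1, 0), (2, 0)],  # not a OR not b OR not c
    [(0, 0), (1, 1), (2, 1)],  # not a OR b OR c
    [(0, 1), (1, 0), (2, 1)],  # a OR not b OR c
    [(0, 1), (1, 1), (2, 0)],  # a OR b OR not c
]
parity_ok = all(
    satisfied(parity_clauses, bits) == ((bits[0] ^ bits[1]) == bits[2])
    for bits in product((0, 1), repeat=3)
)

# Test 2 reads A_f, A_{fg+h}, A_{fg+g+h}, A_h (positions 0..3).
selection_clauses = [
    [(0, 1), (1, 0), (3, 1)],  # A_f OR not A_{fg+h} OR A_h
    [(0, 1), (1, 1), (3, 0)],  # A_f OR A_{fg+h} OR not A_h
    [(0, 0), (2, 0), (3, 1)],  # not A_f OR not A_{fg+g+h} OR A_h
    [(0, 0), (2, 1), (3, 0)],  # not A_f OR A_{fg+g+h} OR not A_h
]


def selection_test(bits):
    a_f, a_fgh, a_fggh, a_h = bits
    return a_fgh == a_h if a_f == 0 else a_fggh == a_h


selection_ok = all(
    satisfied(selection_clauses, bits) == selection_test(bits)
    for bits in product((0, 1), repeat=4)
)
```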

Definition 14. The distance between two binary strings x, y of the same length is 
the fraction of coordinates in which they differ. 

Claim. Let $E \in \{0,1\}^{\mathcal{F}_n}$ be a long code of some string $x \in \{0,1\}^n$. Let 
$D \in \{0,1\}^{\mathcal{F}_n}$ be some string folded over $(h, b)$ such that $h(x) \neq b$. Then D and E 
are $\frac{1}{2}$-far. 

Proof. Let $\alpha = \{f \mid E_f = D_f\}$. If $|\alpha| > \frac{1}{2}|\mathcal{F}_n|$, then there is f such that $f, f + h \in \alpha$. 
$E_f = f(x)$, $E_{f+h} = f(x) + h(x) = f(x) + b + 1$. However $D_{f+h} = D_f + b$, so 
it is impossible that $f, f + h \in \alpha$. □ 
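
The claim can be checked exhaustively for n = 2. The sketch below enumerates all strings indexed by the 16 functions on two bits, keeps those folded over a fixed (h, b), and records their distance from the long code of an x with $h(x) \neq b$. The particular choice of h, b and x, and the integer truth-table encoding, are assumptions of this example.

```python
n = 2
num_functions = 2 ** (2 ** n)   # 16 Boolean functions on 2 bits
h, b, x = 0b1001, 0, 0          # h(x) = 1, which differs from b

# Long code of x: entry f holds f(x).
E = [(f >> x) & 1 for f in range(num_functions)]


def is_folded(D):
    """Check D[f + h] = D[f] + b for every function f."""
    return all(D[f ^ h] == D[f] ^ b for f in range(num_functions))


distances = set()
for code in range(2 ** num_functions):
    D = [(code >> f) & 1 for f in range(num_functions)]
    if is_folded(D):
        distances.add(sum(d != e for d, e in zip(D, E)) / num_functions)
```

In each pair (f, f + h) exactly one coordinate disagrees, so every folded string ends up at distance exactly one half in this small case.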

Definition 15. Fix $\delta > 0$ arbitrarily small. We say that A, D are (C, B, $\delta$)-solid 
if A, D are $(\frac{1}{2} - \delta)$-close to some $\tilde{A}, \tilde{D}$, respectively, where $\tilde{A}$ is a long code of an 
assignment a satisfying the edges in C, and $\tilde{D}$ is the long code of an assignment 
d for B, such that d is consistent with a. Usually, $\delta$ will be known from the 
context and will be omitted. 

Claim. For a game $G^u$, let $E'$ be the set of random coins such that given the 
corresponding sets C, B to $P_1, P_2$ respectively, the answers A, D are (C, B)-solid. 
Then, there are provers $P_1, P_2$ for the game $G^u$ and a set E of random coins, 
$|E| \geq \frac{\delta^4}{16}|E'|$, such that $V^u$ will accept for all choices of random coins in E. 

Proof. Consider a bipartite graph where the vertices of one side are all possible 
sets of u constraints queried in $G^u$ and the other side is the set of all possible 
sets of variables queried. There is an edge between a set of constraints and a set 
of variables if there is a choice of random coins such that the sets are queried at 
the same time. Note that the graph produced can be seen as a constraint graph, 
representing the game $G^u$ with the property that an assignment satisfying k 
edges will satisfy $V^u$ for k choices of random coins. 

If a string is $(\frac{1}{2} - \delta)$-close to a long code, then there are at most $4\delta^{-2}$ long 
codes close to it (Lemma 3.11 in [5]). From the claim on folded strings above, all of 
these long codes must satisfy the selected constraints. 

For any edge (C, B) and respective answers of $(P_1, P_2)$, with (A, D) (C, B)-solid, 
there are $16\delta^{-4}$ possible choices of assignments for C and B (derived from 
the closest $4\delta^{-2}$ for each of them), with at least one choice satisfying the edge 
between them. Selecting an assignment randomly satisfies $\frac{\delta^4}{16}|E'|$ edges in 
expectation. Thus, there is an assignment for $G^u$ that satisfies at least $\frac{\delta^4}{16}|E'|$ 
edges. □ 

Corollary 1. If $G^u$ is at most $\epsilon$-satisfiable, then for any pair of provers at most 
$16\delta^{-4}\epsilon$ of the answers can be solid for their respective queries. 

Given a query to the provers, let A, D be the answers of $P_1, P_2$, respectively. Let 
x be the distance of A from the closest linear function, $\tilde{A}$. 

The fraction of tests (of the first type) that will fail is lower bounded by the 
function ([5], also stated as Lemma 3.15 in [5]) 



If A is not a long code, then the fraction of tests that the second test will 
fail is at least | (1 — 2x) (Lemma 3.19 in [5]). 

If $\tilde{A}$ is a long code of a, $\tilde{D}$ is a long code of d, where d is the restriction of a 
on the respective bits, and y is the distance between D and $\tilde{D}$, then the fraction 
of tests of the third type that will fail is at least y (1 - 2x) (Lemma 3.21 in [6]). 

Theorem 10. There is a + e) -universal factor graph for 3-SAT, for any 




e > 0. 

Proof. Let $\mathcal{G}_n$ be the set of graphs of size n produced by the transformation 
from Lemma [4] applied to the APX-universal factor graph generated in appendix 
O. Note that the degree of every vertex is bounded due to the transformation 
used to create the factor graph. The factor graph of the 3CNF formula checking 
the satisfiability of $G^u$ is the same, for all $G \in \mathcal{G}_n$. 

Suppose that G cannot be completely satisfied. Then, the fraction of answers 
that can be solid to their respective queries is arbitrarily small (fixing small $\delta$, 
increasing u). Non solid answers fail some of the tests, so our goal is to increase 
the fraction of clauses that non solid answers cannot satisfy. If A, D is the answer 
to the query (C, B) which is not (C, B)-solid then there are several possibilities: 

— A is not a long code, then at least n\A (x) + (1 — 2x) tests fail. 

— A is the long code of a, but D is at least i — 8 far from a long code of the 
restriction of a, then at least n\A (x) + n 3 — <5) (1 — 2x) tests fail. 

Otherwise, A, D is (C, B)-solid. 

Let $n_1, n_2, n_3$ be the number of tests of each type for every query (these 
numbers uniquely define the probability with which the inner verifier chooses which 
of the three tests to execute). The total number of clauses is $4(n_1 + n_2 + n_3)$. If a 
test fails then at least a single clause cannot be satisfied (using the respective 
assignment). For = |n2 all the fractions of failed tests (and the number of 
unsatisfied clauses) is the same for both cases (8 can be arbitrarily small) . Let k 
be the total number of clauses. We need to maximize the fraction of unsatisfied 
clauses, that is 

niA (x) +^(k- 4ni) (1 - 2x) 
k 

Then, the minimum must be at x — 0, x — j^, or x — i, so ni — n 3 = ||, 
n 2 = which gives that at least a fraction of the clauses are unsatisfiable. 

□ 



E A Remark on Nearly Tight Thresholds 

The following remark relates to the proof of Theorem [U 

Remark 1. If $\phi_3$ is unweighted, then the weighted formula $\phi_k$ can easily be replaced 
by an unweighted formula without affecting the correctness of the proof. Scale 
all weights by a multiplicative constant so that clauses in $\psi_0$ have weight 1, and 
then clauses in $\psi_i$ for $i \geq 1$ have weight $\gamma m$. We may assume without loss of 
generality that $\gamma m$ is an integer. (Otherwise, decrease $\gamma$ by a little.) Duplicating 
each formula $\psi_i$ (for $i \geq 1$) $\gamma m$ times, each time using a fresh set of three new 
variables as the z variables, gives the desired unweighted version of the formula 
$\phi_k$. 

F Universal Factor Graphs for Additional CSPs 

Many known reductions are in fact FGPRs, and this gives universal factor graphs 
for additional CSPs. Examples include the standard reductions from max-3SAT 
to max-4NAE (adding a variable to all clauses), from max-4NAE to max-3NAE 
(break each clause into two clauses of two literals and add a new variable to one 
clause and its negation to the other) and from max-3NAE to max-2LIN (replace 
every 3NAE clause by three 2LIN clauses, each of which is the XOR of two 
literals from the NAE clause). 
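
The local gadgets behind these three reductions can be verified by brute force over all assignments to the literals involved. The sketch below is only a sanity check of the gadgets (it does not build full instances); the names `w` and `t` for the added variables are hypothetical.

```python
from itertools import product


def nae(*values):
    """Not-all-equal predicate."""
    return len(set(values)) > 1


# max-3SAT -> max-4NAE: the clause l1 OR l2 OR l3 becomes NAE(l1, l2, l3, w) with one
# global variable w. With w = 0 the two agree, and NAE is invariant under complementing
# all values, so w = 0 may be assumed.
sat_to_nae = all(
    (a or b or c) == nae(a, b, c, 0)
    and nae(a, b, c, w) == nae(1 - a, 1 - b, 1 - c, 1 - w)
    for a, b, c, w in product((0, 1), repeat=4)
)

# max-4NAE -> max-3NAE: NAE(a, b, c, d) holds iff some value of a new variable t
# satisfies both NAE(a, b, t) and NAE(c, d, not t).
nae4_to_nae3 = all(
    nae(a, b, c, d) == any(nae(a, b, t) and nae(c, d, 1 - t) for t in (0, 1))
    for a, b, c, d in product((0, 1), repeat=4)
)

# max-3NAE -> max-2LIN: a satisfied NAE clause satisfies exactly two of the three
# XOR constraints on pairs of its literals, and an unsatisfied one satisfies none.
nae3_to_lin = all(
    (a ^ b) + (b ^ c) + (a ^ c) == (2 if nae(a, b, c) else 0)
    for a, b, c in product((0, 1), repeat=3)
)
```

In each gadget the variables appearing in the new clauses depend only on the variables of the original clause, not on their polarities, which is the property that makes such reductions preserve factor graphs.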



